Abstract

We propose new succinct representations of ordinal trees, which have been studied extensively.
It is known that any n-node static tree can be represented in 2n + o(n) bits and a number of
operations on the tree can be supported in constant time under the word-RAM model. However, the
data structures are complicated and difficult to dynamize. We propose a simple and flexible data
structure, called the range min-max tree, that reduces the large number of relevant tree
operations considered in the literature to a few primitives that are carried out in constant
time on sufficiently small trees. The result is extended to trees of arbitrary size, achieving
2n + O(n/polylog(n)) bits of space, which is optimal for some operations. The redundancy is
significantly lower than any previous proposal. For the dynamic case, where insertion/deletion
of nodes is allowed, the existing data structures support very limited operations. Our data
structure builds on the range min-max tree to achieve 2n + O(n/log n) bits of space and
O(log n) time for all the operations. We also propose an improved data structure using
2n + O(n log log n/log n) bits and improving the time to the optimal O(log n/log log n) for most
operations. We extend our support to forests, where whole subtrees can be attached to or
detached from others, in time O(log^{1+ε} n) for any ε > 0.

Our techniques are of independent interest. An immediate derivation gives an improved solution
to range minimum/maximum queries where consecutive elements differ by ±1, achieving
n + O(n/polylog(n)) bits of space. A second one stores an array of numbers supporting operations
sum and search and limited updates, in optimal time O(log n/log log n). A third one allows
representing dynamic bitmaps and sequences supporting rank/select and indels, within zero-order
entropy bounds and optimal time O(log n/log log n) for all operations on bitmaps and
polylog-sized alphabets, and O(log n log σ/(log log n)^2) on larger alphabet sizes σ. This
improves upon the best existing bounds for entropy-bounded storage of dynamic sequences,
compressed full-text self-indexes, and compressed-space construction of the Burrows-Wheeler
transform.

1 Introduction 

Trees are one of the most fundamental data structures. A classical representation of a tree
with n nodes uses O(n) pointers or words. Because each pointer must distinguish all the nodes,
it requires log n bits in the worst case. Therefore the tree occupies O(n log n) bits. This
causes a space problem for storing a large set of items in a tree. Much research has been
devoted to reducing the space to represent static trees and dynamic trees, achieving so-called
succinct data structures for trees.

A succinct data structure stores objects using space close to the information-theoretic lower
bound, while simultaneously supporting a number of primitive operations on the objects in
constant time. Here the information-theoretic lower bound for storing an object from a universe
with cardinality L is log L bits because in the worst case this number of bits is necessary to
distinguish any two objects.

In this paper we are interested in ordinal trees, in which the children of a node are ordered.
The information-theoretic lower bound for representing an ordinal tree with n nodes is
2n − Θ(log n) bits because there exist C(2n−1, n−1)/(2n−1) = 2^{2n}/Θ(n^{3/2}) such trees [33],
where C(a, b) denotes the binomial coefficient. The size of a succinct data structure storing an
object from the universe is typically (1 + o(1)) log L bits. We assume that the computation
model is the word RAM with word length Θ(log n) in which arithmetic and logical operations on
O(log n)-bit integers and O(log n)-bit memory accesses can be done in constant time.
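The lower bound above can be checked numerically. The following is a minimal sketch, not part of the paper: the function names are ours, and it simply evaluates the counting formula quoted in the text and compares log2 of the count with 2n.

```python
from math import comb, log2


def num_ordinal_trees(n: int) -> int:
    """Number of ordinal trees with n >= 1 nodes: C(2n-1, n-1) / (2n-1)."""
    # The division is exact: the quotient is a Catalan number.
    return comb(2 * n - 1, n - 1) // (2 * n - 1)


def lower_bound_bits(n: int) -> float:
    """Information-theoretic lower bound, log2 of the number of trees."""
    return log2(num_ordinal_trees(n))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The gap between 2n and the bound grows only logarithmically in n.
        print(n, 2 * n, round(lower_bound_bits(n), 2))
```

The gap between 2n and the printed bound is the Θ(log n) term of the text.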

Basically there exist three types of succinct representations of ordinal trees: the balanced
parentheses sequence (BP) [26, 33], the level-order unary degree sequence (LOUDS) [26], and the
depth-first unary degree sequence (DFUDS) [6, 27]. An example of them is shown in Figure 1.
LOUDS is a simple representation, but it lacks many basic operations, such as the subtree size
of a given node. Both BP and DFUDS build on a sequence of balanced parentheses, the former using
the intuitive depth-first-search representation and the latter using a more sophisticated one.
The advantage of DFUDS is that it supports a more complete set of operations by simple
primitives, most notably going to the i-th child of a node in constant time. In this paper we
focus on the BP representation, and achieve constant time for a large set of operations,
including all those handled with DFUDS. Moreover, as we manipulate a sequence of balanced
parentheses, our data structure can be used to implement a DFUDS representation as well.

1.1 Our contributions 

We propose new succinct data structures for ordinal trees encoded with balanced parentheses, in 
both static and dynamic scenarios. 

Static succinct trees. For the static case we obtain the following result. 

Theorem 1 For any ordinal tree with n nodes, all operations in Table 1 except insert and delete
are carried out in constant time O(c) with a data structure using 2n + O(n/log^c n) bits of
space on a Θ(log n)-bit word RAM, for any constant c > 0. The data structure can be constructed
from the balanced parentheses sequence of the tree, in O(n) time using O(n) bits of space.

The space complexity of our data structures significantly improves upon the lower-order term
achieved in previous representations. For example, the extra data structure for level_ancestor
requires O(n log log n/√(log n)) bits [36], or O(n (log log n)^2/log n) bits [27],* and that for
child requires O(n/(log log n)^2) bits [28]. Ours requires O(n/log^c n) bits for all of the
operations. We show in the Conclusions that this redundancy is optimal for some operations.

* This data structure is for DFUDS, but the same technique can also be applied to BP.

The simplicity and space-efficiency of our data structures stem from the fact that any query
operation in Table 1 is reduced to a few basic operations on a bit vector, which can be
efficiently solved by a range min-max tree. This approach is different from previous studies, in
which each operation needs distinct auxiliary data structures, so that their total space is the
sum over all the data structures. For example, the first succinct representation of BP [33]
supported only findclose, findopen, and enclose (and other easy operations) and each operation
used different data structures. Later, many further operations such as lmost_leaf, lca,
degree [9], child and child_rank [25], and level_ancestor [36] were added to this representation
by using other types of data structures for each. There exists another elegant data structure
for BP supporting findclose, findopen, and enclose [19]. This reduces the size of the data
structure for these basic operations, but still has to add extra auxiliary data structures for
other operations.



Dynamic succinct trees. Our approach is suitable for the dynamic maintenance of trees. Former
approaches in the static case use two-level data structures to reduce the size, which causes
difficulties in the dynamic case. On the other hand, our approach using the range min-max tree
is easily applied in this scenario, resulting in simple and efficient dynamic data structures.
This is illustrated by the fact that all the operations are supported. The following theorem
summarizes our results.

Theorem 2 On a Θ(log n)-bit word RAM, all operations on a dynamic ordinal tree with n nodes
can be carried out within the worst-case complexities given in Table 1, using a data structure
that requires 2n + O(n log log n/log n) bits. Alternatively, the operations of the table can be
carried out in O(log n) time using 2n + O(n/log n) bits of space.

Note that we achieve time complexity O(log n/log log n) for most operations, including insert
and delete, if we solve degree, child, and child_rank naively. Otherwise we can achieve O(log n)
complexity for these, yet also for insert and delete.* The time complexity O(log n/log log n) is
optimal: Chan et al. [8, Thm. 5.2] showed that just supporting the most basic operations of
Table 1 (findopen, findclose, and enclose, as we will see) plus insert and delete requires this
time even in the amortized sense, by a reduction from Fredman and Saks's lower bounds on rank
queries [17].

Moreover, we are able to attach and detach whole subtrees, in time O(log^{1+ε} n) for any
constant ε > 0 (see Section 2.3 for the precise details). These operations had never been
considered before in succinct tree representations.



Byproducts. Our techniques are of more general interest. A subset of our data structure is able
to solve the well-known "range minimum query" problem [3]. In the important case where
consecutive elements differ by ±1, we improve upon the best current space redundancy of
O(n log log n/log n) bits.

Corollary 1 Let E[0, n−1] be an array of numbers with the property that
E[i] − E[i−1] ∈ {−1, +1} for 0 < i < n, encoded as a bit vector P[0, n−1] such that P[i] = 1 if
E[i] − E[i−1] = +1 and P[i] = 0 otherwise. Then, in a RAM machine we can preprocess P in O(n)
time and O(n) bits such that range maximum/minimum queries are answered in constant O(c) time
and O(n/log^c n) extra bits on top of P.

* In the conference version of this paper [49] we erroneously affirm we can obtain
O(log n/log log n) for all these operations, as well as level_ancestor, level_next/level_prev,
and level_lmost/level_rmost, for which we can actually obtain only O(log n).
Another direct application, to the representation of a dynamic array of numbers, yields an
improvement to the best current alternative [29] by a Θ(log log n) time factor. If the updates
are limited, further operations sum (that gives the sum of the numbers up to some position) and
search (that finds the position where a given sum is exceeded) can be supported, and our
complexity matches the lower bounds for searchable partial sums by Pătraşcu and Demaine [40] (if
the updates are not limited one can still use previous results [29], which are optimal in that
general case). We present our result in a slightly more general form.

Lemma 1 A sequence of n variable-length constant-time self-delimiting* bit codes x_1 ... x_n,
where |x_i| = O(log n), can be stored within (Σ |x_i|)(1 + o(1)) bits of space, so that we can
(i) compute any sequence of codes x_i, ..., x_j, (ii) update any code x_i ← y, (iii) insert a
new code z between any pair of codes, and (iv) delete any code x_d from the sequence, all in
O(log n/log log n) time (plus j − i for (i)). Moreover, let f(x_i) be a nonnegative integer
function computable in constant time from the codes. If the updates and indels are such that
|f(y) − f(x_i)|, f(z), f(x_d) = O(log n), then we can also support operations
sum(i) = Σ_{j=1}^{i} f(x_j) and search(s) = max{i, sum(i) ≤ s} within the same time.

For example we can store n numbers 0 ≤ a_i < 2^k within kn + o(kn) bits, by using their k-bit
binary representation [a_i]_2 as the code, and their numeric value as f([a_i]_2) = a_i, so that
we support sum and search on the sequence of numbers. If the numbers are very different in
magnitude we can δ-encode them to achieve (Σ log a_i)(1 + o(1)) + O(n) bits of space. We can
also store bits, seen as 1-bit codes, in n + o(n) bits and carry out sum = rank and
search = select, insertions and deletions, in O(log n/log log n) time.

A further application of our results to the compressed representation of sequences achieves a 
result summarized in the next theorem. 

Theorem 3 Any sequence S[0, n−1] over alphabet [1, σ] can be stored in
nH_0(S) + O(n log σ/log^ε n + σ log^ε n) bits of space, for any constant 0 < ε < 1, and support
the operations rank, select, insert, and delete. For polylogarithmic-sized alphabets, the time
is the optimal O(log n/log log n); otherwise it is O(log n log σ/(log log n)^2).

This time complexity improves upon the best current result [22] by a Θ(log log n) factor. The
optimality of the polylogarithmic case stems again from Fredman and Saks' lower bound on rank
on dynamic bitmaps [17]. This result has immediate applications to building compressed indexes
for text, building the Burrows-Wheeler transform within compressed space, and so on.

1.2 Organization of the paper

In Section 2 we review basic data structures used in this paper. Section 3 describes the main
ideas for our new data structures for ordinal trees. Sections 4 and 5 describe the static
construction. In Sections 6 and 7 we give two data structures for dynamic ordinal trees. In
Section 8 we derive our new results on compressed sequences and applications. In Section 9 we
conclude and give future work directions.

* This means that one can distinguish the first code x_1 from a bit stream x_1 α in constant
time.
Table 1: Operations supported by our data structure. The time complexities are for the dynamic
case; in the static case all operations are performed in constant time. The first group is
composed of basic operations, used to implement the others, but which could have other uses.

operation                         | description                                       | time, variant 1       | time, variant 2
----------------------------------+---------------------------------------------------+-----------------------+---------------------
inspect(i)                        | P[i]                                              | O(log n/log log n)    | O(log n/log log n)
findclose(i) / findopen(i)        | position of parenthesis matching P[i]             | O(log n/log log n)    | O(log n/log log n)
enclose(i)                        | position of tightest open parenthesis enclosing i | O(log n/log log n)    | O(log n/log log n)
rank_((i) / rank_)(i)             | number of open/close parentheses in P[0, i]       | O(log n/log log n)    | O(log n/log log n)
select_((i) / select_)(i)         | position of i-th open/close parenthesis           | O(log n/log log n)    | O(log n/log log n)
rmqi(i, j) / RMQi(i, j)           | position of min/max excess value in range [i, j]  | O(log n/log log n)    | O(log n/log log n)
----------------------------------+---------------------------------------------------+-----------------------+---------------------
pre_rank(i) / post_rank(i)        | preorder/postorder rank of node i                 | O(log n/log log n)    | O(log n/log log n)
pre_select(i) / post_select(i)    | the node with preorder/postorder i                | O(log n/log log n)    | O(log n/log log n)
isleaf(i)                         | whether P[i] is a leaf                            | O(log n/log log n)    | O(log n/log log n)
isancestor(i, j)                  | whether i is an ancestor of j                     | O(log n/log log n)    | O(log n/log log n)
depth(i)                          | depth of node i                                   | O(log n/log log n)    | O(log n/log log n)
parent(i)                         | parent of node i                                  | O(log n/log log n)    | O(log n/log log n)
first_child(i) / last_child(i)    | first/last child of node i                        | O(log n/log log n)    | O(log n/log log n)
next_sibling(i) / prev_sibling(i) | next/previous sibling of node i                   | O(log n/log log n)    | O(log n/log log n)
subtree_size(i)                   | number of nodes in the subtree of node i          | O(log n/log log n)    | O(log n/log log n)
level_ancestor(i, d)              | ancestor j of i s.t. depth(j) = depth(i) − d      | O(log n)              | O(log n)
level_next(i) / level_prev(i)     | next/previous node of i in BFS order              | O(log n)              | O(log n)
level_lmost(d) / level_rmost(d)   | leftmost/rightmost node with depth d              | O(log n)              | O(log n)
lca(i, j)                         | the lowest common ancestor of two nodes i, j      | O(log n/log log n)    | O(log n/log log n)
deepest_node(i)                   | the (first) deepest node in the subtree of i      | O(log n/log log n)    | O(log n/log log n)
height(i)                         | the height of i (distance to its deepest node)    | O(log n/log log n)    | O(log n/log log n)
degree(i)                         | q = number of children of node i                  | O(q log n/log log n)  | O(log n)
child(i, q)                       | q-th child of node i                              | O(q log n/log log n)  | O(log n)
child_rank(i)                     | q = number of siblings to the left of node i      | O(q log n/log log n)  | O(log n)
in_rank(i)                        | inorder of node i                                 | O(log n/log log n)    | O(log n/log log n)
in_select(i)                      | node with inorder i                               | O(log n/log log n)    | O(log n/log log n)
leaf_rank(i)                      | number of leaves to the left of leaf i            | O(log n/log log n)    | O(log n/log log n)
leaf_select(i)                    | i-th leaf                                         | O(log n/log log n)    | O(log n/log log n)
lmost_leaf(i) / rmost_leaf(i)     | leftmost/rightmost leaf of node i                 | O(log n/log log n)    | O(log n/log log n)
insert(i, j)                      | insert node given by matching parentheses at i, j | O(log n/log log n)    | O(log n)
delete(i)                         | delete node i                                     | O(log n/log log n)    | O(log n)
Figure 1: Succinct representations of trees.



2 Preliminaries 

Here we describe the balanced parentheses sequence and basic data structures used in this paper. 



2.1 Succinct data structures for rank/ select 

Consider a bit string S[0, n−1] of length n. We define rank and select for S as follows.
rank_c(S, i) is the number of occurrences of c ∈ {0, 1} in S[0, i], and select_c(S, i) is the
position of the i-th occurrence of c in S. Note that rank_c(S, select_c(S, i)) = i and
select_c(S, rank_c(S, i)) ≤ i.
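The two definitions and the identities just stated can be made concrete with a naive linear-time sketch. This is only an illustration of the semantics (0-based positions, occurrences counted from 1), not the constant-time structures discussed next; the function names are ours.

```python
def rank(S, c, i):
    """Number of occurrences of bit c in S[0..i] (inclusive)."""
    return sum(1 for b in S[: i + 1] if b == c)


def select(S, c, i):
    """Position of the i-th occurrence (i >= 1) of bit c in S, or -1 if none."""
    seen = 0
    for pos, b in enumerate(S):
        if b == c:
            seen += 1
            if seen == i:
                return pos
    return -1


if __name__ == "__main__":
    S = [1, 0, 1, 1, 0]
    print(rank(S, 1, 2), select(S, 1, 2))
```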

There exist many succinct data structures for rank/select. A basic one uses n + o(n) bits and
supports rank/select in constant time on the word RAM with word length O(log n). The space can
be reduced if the number of 1's is small. For a string with m 1's, there exists a data structure
for constant-time rank/select using nH_0(S) + O(n log log n/log n) bits, where
H_0(S) = (m/n) log(n/m) + ((n−m)/n) log(n/(n−m)) is called the empirical zero-order entropy of
the sequence, so that nH_0(S) = m log(n/m) + O(m). The space overhead on top of the entropy has
been recently reduced [42] to O(n t^t/log^t n + n^{3/4}) bits, while supporting rank and select
in O(t) time. This can be built in linear worst-case time.*

A crucial technique for succinct data structures is table lookup. For small-size problems we
construct a table which stores answers for all possible sequences and queries. For example, for
rank and select, we use a table storing all answers for all 0,1 patterns of length (1/2) log n.
Because there exist only 2^{(1/2) log n} = √n different patterns, we can store all answers in a
universal table (i.e., not depending on the bit sequence) that uses
√n · polylog(n) = o(n/polylog(n)) bits, which can be accessed in constant time on a word RAM
with word length Θ(log n).
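The table-lookup idea can be sketched as follows. This is a simplified illustration under our own assumptions: the chunk length b plays the role of (1/2) log n, the per-chunk counts are kept in a plain list rather than in the multi-level directories used by real succinct structures, and all names are hypothetical.

```python
def build_rank_table(b):
    """Universal table: table[pattern][t] = number of 1s among bits 0..t
    of the b-bit pattern (bit 0 is the leftmost bit)."""
    table = []
    for pattern in range(1 << b):
        row, ones = [], 0
        for t in range(b):
            ones += (pattern >> (b - 1 - t)) & 1
            row.append(ones)
        table.append(row)
    return table


def preprocess(S, b):
    """Split S into b-bit chunks; keep each chunk as an integer together
    with the number of 1s that precede it."""
    padded = S + [0] * (-len(S) % b)
    chunks, before, ones = [], [], 0
    for start in range(0, len(padded), b):
        value = 0
        for bit in padded[start:start + b]:
            value = (value << 1) | bit
        chunks.append(value)
        before.append(ones)
        ones += sum(padded[start:start + b])
    return chunks, before


def rank1(i, b, table, chunks, before):
    """Number of 1s in S[0..i]: one count lookup plus one table lookup."""
    c = i // b
    return before[c] + table[chunks[c]][i % b]
```

The table depends only on b, not on S, which is what makes it "universal" in the sense of the text.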

The definition of rank and select on bitmaps generalizes to arbitrary sequences over an integer
alphabet [1, σ], as well as the definition of zero-order empirical entropy of sequences, to
H_0(S) = Σ_{1≤c≤σ} (n_c/n) log(n/n_c), where c occurs n_c times in S. A compressed
representation of general sequences that supports rank/select is achieved through a structure
called a wavelet tree [23]. This is a complete binary tree that partitions the alphabet [1, σ]
into contiguous halves at each node. The node then stores a bitmap telling which branch each
letter went to. The tree has height ⌈log σ⌉, and it reduces rank and select operations to
analogous operations on its bitmaps in a root-to-leaf or leaf-to-root traversal. If the bitmaps
are represented within their zero-order entropy, the total space adds up to
nH_0(S) + o(n log σ) and the operations are supported in O(log σ) time. This can be improved to
O(1 + log σ/log log n), while maintaining the same asymptotic space, by using a multiary

* They use a predecessor structure by Pătraşcu and Thorup, more precisely one of their results
that is a simple modification of van Emde Boas' data structure.
wavelet tree of arity Θ(√(log n)), and replacing the bitmaps by sequences over small alphabets,
which still can answer rank/select in constant time [13].

2.2 Succinct tree representations 

A rooted ordered tree T, or ordinal tree, with n nodes is represented by a string P[0, 2n−1] of
balanced parentheses of length 2n. A node is represented by a pair of matching parentheses
( ... ) and all subtrees rooted at the node are encoded in order between the matching
parentheses (see Figure 1 for an example). A node v ∈ T is identified with the position i of the
open parenthesis P[i] representing the node.
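As a small illustration of this encoding, the sketch below builds the BP string of a tree. The input format (a node given as the list of its children, themselves nodes) is our own assumption for the example, not something the paper prescribes.

```python
def bp_encode(children):
    """Balanced-parentheses string of the ordinal tree whose root has the
    given list of children; each child is again a list of children."""
    return "(" + "".join(bp_encode(child) for child in children) + ")"


if __name__ == "__main__":
    # Root with two children: a leaf, and an internal node with two leaves.
    tree = [[], [[], []]]
    P = bp_encode(tree)
    print(P)  # 5 nodes, hence 10 parentheses
```

Each node corresponds to the position of its open parenthesis, so the root is position 0 and the second child of the root is position 3 in this example.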

There exist many succinct data structures for ordinal trees. Among them, the ones with maximum
functionality [11] support all the operations in Table 1, except insert and delete, in constant
time using 2n + O(n log log log n/log log n)-bit space. Our static data structure supports the
same operations and reduces the space to 2n + O(n/polylog(n)) bits.

2.3 Dynamic succinct trees 

We consider insertion and deletion of internal nodes or leaves in ordinal trees. In this
setting, there exist no data structures supporting all the operations in Table 1. The data
structure of Raman and Rao supports, for binary trees, parent, left and right child, and
subtree_size of the current node in the course of traversing the tree in constant time, and
updates in O((log log n)^{1+ε}) time. Note that this data structure assumes that all traversals
start from the root. Chan et al. [8] gave a dynamic data structure using O(n) bits and
supporting findopen, findclose, enclose, and updates, in O(log n/log log n) time. They also gave
another data structure using O(n) bits and supporting findopen, findclose, enclose, lca,
leaf_rank, leaf_select, and updates, in O(log n) time.

Furthermore, we consider the more sophisticated operation (which is simple on classical trees)
of attaching a new subtree as the new child of a node, instead of just a leaf. The model is that
this new subtree is already represented with our data structures. Both trees are thereafter
blended and become a unique tree. Similarly, we can detach any subtree from a given tree so that
it becomes an independent entity represented with our data structure. This allows for extremely
flexible support of algorithms handling dynamic trees, well beyond the limited operations
allowed in previous work. This time we have to consider a maximum possible value for log n (say,
w, the width of the system-wide pointers). Then we require 2n + O(n log w/w + √(2^w)) bits of
space and carry out the queries in time O(w/log w) or O(w), depending on the tree. Insert or
delete takes O(w^{1+ε}) time for any constant ε > 0 if we wish to allow attachment and
detachment of subtrees, which then can also be carried out in time O(w^{1+ε}).

2.4 Dynamic compressed bitmaps and sequences 

Let B[0, n−1] be a bitmap. We want to support operations rank and select on B, as well as
operations insert(B, i, b), which inserts bit b between B[i] and B[i+1], and delete(B, i), which
deletes position B[i] from B. Chan et al. [8] handle all these operations in
O(log n/log log n) time (which is optimal [17]) using O(n) bits of space (actually, by reducing
the problem to a particular dynamic tree). Mäkinen and Navarro [29] achieve O(log n) time and
nH_0(B) + O(n log log n/√(log n)) bits of space. The results can be generalized to sequences.
González and Navarro [22] achieve nH_0 + O(n log σ/√(log n)) bits of space and
O(log n (1 + log σ/log log n)) time to handle all the operations on a sequence over alphabet
[1, σ]. They give several applications to managing dynamic text collections, construction of
static compressed indexes within compressed space, and construction of the Burrows-Wheeler
transform [7] within compressed space. We improve all these results in this paper, achieving the
optimal O(log n/log log n) on polylog-sized alphabets and reducing the lower-order term in the
compressed space by a Θ(log log n) factor.

3 Fundamental concepts 

In this section we give the basic ideas of our ordinal tree representation. In the next sections we 
build on these to define our static and dynamic representations. 

We represent a possibly non-balanced* parentheses sequence by a 0,1 vector P[0, n−1]
(P[i] ∈ {0, 1}). Each opening/closing parenthesis is encoded by ( = 1, ) = 0.

First, we recall that several operations of Table 1 either are trivial in a BP representation,
or are easily solved using enclose, findclose, findopen, rank, and select [33]. These are:



inspect(i)        = P[i] (or rank_1(P, i) − rank_1(P, i−1) if there is no access to P[i])
isleaf(i)         = [P[i+1] = 0]
isancestor(i, j)  = i ≤ j ≤ findclose(P, i)
depth(i)          = rank_1(P, i) − rank_0(P, i)
parent(i)         = enclose(P, i)
pre_rank(i)       = rank_1(P, i)
pre_select(i)     = select_1(P, i)
post_rank(i)      = rank_0(P, findclose(P, i))
post_select(i)    = findopen(P, select_0(P, i))
first_child(i)    = i + 1 (if P[i+1] = 1, else i is a leaf)
last_child(i)     = findopen(P, findclose(P, i) − 1) (if P[i+1] = 1, else i is a leaf)
next_sibling(i)   = findclose(i) + 1 (if P[findclose(i)+1] = 1, else i is the last sibling)
prev_sibling(i)   = findopen(i − 1) (if P[i−1] = 0, else i is the first sibling)
subtree_size(i)   = (findclose(i) − i + 1)/2
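The reductions listed above can be exercised with a naive sketch in which findclose, findopen, and enclose are implemented by linear scans. This is only meant to show that the identities hold on a small example; P is a Python string of parentheses, None signals "no such node", and the linear scans are our stand-in for the efficient primitives developed later.

```python
def findclose(P, i):
    """Position of the ')' matching the '(' at position i (linear scan)."""
    depth = 0
    for j in range(i, len(P)):
        depth += 1 if P[j] == "(" else -1
        if depth == 0:
            return j
    return None


def findopen(P, i):
    """Position of the '(' matching the ')' at position i (linear scan)."""
    depth = 0
    for j in range(i, -1, -1):
        depth += 1 if P[j] == ")" else -1
        if depth == 0:
            return j
    return None


def enclose(P, i):
    """Position of the tightest '(' enclosing the '(' at position i."""
    depth = 0
    for j in range(i - 1, -1, -1):
        depth += 1 if P[j] == "(" else -1
        if depth == 1:
            return j
    return None


def isleaf(P, i):
    return P[i + 1] == ")"


def depth(P, i):
    return P[: i + 1].count("(") - P[: i + 1].count(")")


def first_child(P, i):
    return None if isleaf(P, i) else i + 1


def last_child(P, i):
    return None if isleaf(P, i) else findopen(P, findclose(P, i) - 1)


def next_sibling(P, i):
    c = findclose(P, i)
    return c + 1 if c + 1 < len(P) and P[c + 1] == "(" else None


def prev_sibling(P, i):
    return findopen(P, i - 1) if i > 0 and P[i - 1] == ")" else None


def subtree_size(P, i):
    return (findclose(P, i) - i + 1) // 2
```

For instance, on P = "(()(()()))" the node at position 3 has children at positions 4 and 6 and a subtree of 3 nodes.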



Hence the above operations will not be considered further in the paper. Let us now focus on a
small set of primitives needed to implement most of the other operations. For any function g(·)
on {0, 1}, we define the following.

Definition 1 For a 0,1 vector P[0, n−1] and a function g(·) on {0, 1},

sum(P, g, i, j)         :=  \sum_{k=i}^{j} g(P[k])
fwd_search(P, g, i, d)  :=  \min_{j \ge i} \{ j \mid sum(P, g, i, j) = d \}
bwd_search(P, g, i, d)  :=  \max_{j \le i} \{ j \mid sum(P, g, j, i) = d \}
rmq(P, g, i, j)         :=  \min_{i \le k \le j} \{ sum(P, g, i, k) \}
rmqi(P, g, i, j)        :=  \operatorname{argmin}_{i \le k \le j} \{ sum(P, g, i, k) \}
RMQ(P, g, i, j)         :=  \max_{i \le k \le j} \{ sum(P, g, i, k) \}
RMQi(P, g, i, j)        :=  \operatorname{argmax}_{i \le k \le j} \{ sum(P, g, i, k) \}

* As later we will use these constructions to represent arbitrary segments of a balanced
sequence.
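The primitives of Definition 1 can be written down directly as linear scans. The sketch below is a reference implementation of the definitions only (not the efficient structure of the paper); the name sum_g replaces sum to avoid shadowing Python's builtin, ties in rmqi/RMQi are broken toward the leftmost position, and None means that no valid position exists. These conventions are our assumptions.

```python
def sum_g(P, g, i, j):
    """sum(P, g, i, j): sum of g(P[k]) for i <= k <= j."""
    return sum(g(P[k]) for k in range(i, j + 1))


def fwd_search(P, g, i, d):
    """Smallest j >= i with sum(P, g, i, j) == d, or None."""
    acc = 0
    for j in range(i, len(P)):
        acc += g(P[j])
        if acc == d:
            return j
    return None


def bwd_search(P, g, i, d):
    """Largest j <= i with sum(P, g, j, i) == d, or None."""
    acc = 0
    for j in range(i, -1, -1):
        acc += g(P[j])
        if acc == d:
            return j
    return None


def _prefix_sums(P, g, i, j):
    """List of (sum(P, g, i, k), k) for i <= k <= j."""
    out, acc = [], 0
    for k in range(i, j + 1):
        acc += g(P[k])
        out.append((acc, k))
    return out


def rmq(P, g, i, j):
    return min(v for v, _ in _prefix_sums(P, g, i, j))


def rmqi(P, g, i, j):
    # min() returns the first minimum, i.e. the leftmost position.
    return min(_prefix_sums(P, g, i, j), key=lambda t: t[0])[1]


def RMQ(P, g, i, j):
    return max(v for v, _ in _prefix_sums(P, g, i, j))


def RMQi(P, g, i, j):
    # max() returns the first maximum, i.e. the leftmost position.
    return max(_prefix_sums(P, g, i, j), key=lambda t: t[0])[1]
```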



The following function is particularly important. 

Definition 2 Let π be the function such that π(1) = 1, π(0) = −1. Given P[0, n−1], we define
the excess array E[0, n−1] of P as an integer array such that E[i] = sum(P, π, 0, i).

Note that E[i] stores the difference between the number of opening and closing parentheses in
P[0, i]. When P[i] is an opening parenthesis, E[i] = depth(i) is the depth of the corresponding
node, and is the depth minus 1 for closing parentheses. We will use E as a conceptual device in
our discussions; it will not be stored. Note that, given the form of π, it holds that
|E[i+1] − E[i]| = 1 for all i.
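As a quick illustration, the excess array can be computed with a running sum. The sketch below uses the parentheses sequence that appears in Figure 2 of the paper; the function name and the 0/1 list encoding are our own choices.

```python
from itertools import accumulate


def excess(P):
    """Excess array E of a 0/1 vector P: E[i] = #1s - #0s in P[0..i]."""
    return list(accumulate(1 if bit == 1 else -1 for bit in P))


if __name__ == "__main__":
    bits = [1 if ch == "(" else 0 for ch in "(()((()())())(()())())"]
    print(excess(bits))
```

Consecutive entries always differ by exactly 1, and the last entry is 0 for a balanced sequence.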

The above operations are sufficient to implement the basic navigation on parentheses, as the
next lemma shows. Note that the equation for findclose is well known, and the one for
level_ancestor has appeared as well [36], but we give proofs for completeness.

Lemma 2 Let P be a BP sequence encoded by {0, 1}. Then findclose, findopen, enclose, and
level_ancestor can be expressed as follows.

findclose(i)          = fwd_search(P, π, i, 0)
findopen(i)           = bwd_search(P, π, i, 0)
enclose(i)            = bwd_search(P, π, i, 2)
level_ancestor(i, d)  = bwd_search(P, π, i, d + 1)

Proof. For findclose, let j > i be the position of the closing parenthesis matching the opening
parenthesis at P[i]. Then j is the smallest index > i such that E[j] = E[i] − 1 = E[i−1]
(because of the node depths). Since by definition E[k] = E[i−1] + sum(P, π, i, k) for any k > i,
j is the smallest index > i such that sum(P, π, i, j) = 0. This is, by definition,
fwd_search(P, π, i, 0).

For findopen, let j < i be the position of the opening parenthesis matching the closing
parenthesis at P[i]. Then j is the largest index < i such that E[j−1] = E[i] (again, because of
the node depths).* Since by definition E[k−1] = E[i] − sum(P, π, k, i) for any k < i, j is the
largest index < i such that sum(P, π, j, i) = 0. This is bwd_search(P, π, i, 0).

For enclose, let j < i be the position of the opening parenthesis that most tightly encloses the
opening parenthesis at P[i]. Then j is the largest index < i such that E[j−1] = E[i] − 2 (note
that now P[i] is an opening parenthesis). Now we reason as for findopen to get
sum(P, π, j, i) = 2.

Finally, the proof for level_ancestor is similar to that for enclose. Now j is the largest index
< i such that E[j−1] = E[i] − d − 1, which is equivalent to sum(P, π, j, i) = d + 1. □

* Note that E[j] − 1 = E[i] could hold at incorrect places, where P[j] is a closing parenthesis.
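The four identities of Lemma 2 can be checked on a small tree with the following sketch. The searches are naive linear scans standing in for the efficient primitives, P is a list of 0/1 values, and None means that the requested node does not exist; these are our assumptions for the illustration.

```python
def pi(bit):
    """The function pi of Definition 2: +1 for '(' (bit 1), -1 for ')' (bit 0)."""
    return 1 if bit == 1 else -1


def fwd_search(P, g, i, d):
    """Smallest j >= i with g(P[i]) + ... + g(P[j]) == d, or None."""
    acc = 0
    for j in range(i, len(P)):
        acc += g(P[j])
        if acc == d:
            return j
    return None


def bwd_search(P, g, i, d):
    """Largest j <= i with g(P[j]) + ... + g(P[i]) == d, or None."""
    acc = 0
    for j in range(i, -1, -1):
        acc += g(P[j])
        if acc == d:
            return j
    return None


def findclose(P, i):
    return fwd_search(P, pi, i, 0)


def findopen(P, i):
    return bwd_search(P, pi, i, 0)


def enclose(P, i):
    return bwd_search(P, pi, i, 2)


def level_ancestor(P, i, d):
    return bwd_search(P, pi, i, d + 1)
```

With P encoding "(()(()()))", level_ancestor(P, 4, 1) coincides with enclose(P, 4), as the lemma predicts for d = 1.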
We also have the following, easy or well-known, equalities:

lca(i, j)         = min(i, j), if isancestor(i, j) or isancestor(j, i)
                  = parent(rmqi(P, π, i, j) + 1), otherwise [46]
deepest_node(i)   = RMQi(P, π, i, findclose(i))
height(i)         = depth(deepest_node(i)) − depth(i)
level_next(i)     = fwd_search(P, π, findclose(i), 0)
level_prev(i)     = findopen(bwd_search(P, π, i, 0))
level_lmost(d)    = fwd_search(P, π, 0, d)
level_rmost(d)    = findopen(bwd_search(P, π, n − 1, −d))
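A few of these equalities can be tried out with a naive sketch. It assumes nodes are identified by the positions of their open parentheses (so an ancestor always has the smaller position), implements all searches as linear scans, and breaks ties toward the leftmost position; the helper names are ours.

```python
def pi(bit):
    return 1 if bit == 1 else -1


def fwd_search(P, i, d):
    acc = 0
    for j in range(i, len(P)):
        acc += pi(P[j])
        if acc == d:
            return j
    return None


def bwd_search(P, i, d):
    acc = 0
    for j in range(i, -1, -1):
        acc += pi(P[j])
        if acc == d:
            return j
    return None


def findclose(P, i):
    return fwd_search(P, i, 0)


def parent(P, i):
    return bwd_search(P, i, 2)  # enclose


def depth(P, i):
    return sum(pi(b) for b in P[: i + 1])


def _extreme(P, i, j, better):
    """Leftmost position k in [i, j] whose running sum from i is extreme."""
    best_k, best_v, acc = i, None, 0
    for k in range(i, j + 1):
        acc += pi(P[k])
        if best_v is None or better(acc, best_v):
            best_k, best_v = k, acc
    return best_k


def rmqi(P, i, j):
    return _extreme(P, i, j, lambda a, b: a < b)


def RMQi(P, i, j):
    return _extreme(P, i, j, lambda a, b: a > b)


def lca(P, i, j):
    if i > j:
        i, j = j, i
    if j <= findclose(P, i):      # i is an ancestor of j (or i == j)
        return i
    return parent(P, rmqi(P, i, j) + 1)


def deepest_node(P, i):
    return RMQi(P, i, findclose(P, i))


def height(P, i):
    return depth(P, deepest_node(P, i)) - depth(P, i)


def level_lmost(P, d):
    return fwd_search(P, 0, d)
```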



We also show that the above functions unify the algorithms for computing rank/select on 0,1
vectors and those for balanced parenthesis sequences. Namely, let φ, ψ be functions such that
φ(0) = 0, φ(1) = 1, ψ(0) = 1, ψ(1) = 0. Then the following equalities hold.

Lemma 3 For a 0,1 vector P,

rank_1(P, i)    = sum(P, φ, 0, i)
select_1(P, i)  = fwd_search(P, φ, 0, i)
rank_0(P, i)    = sum(P, ψ, 0, i)
select_0(P, i)  = fwd_search(P, ψ, 0, i)

Therefore, in principle we must focus only on the following set of primitives: fwd_search,
bwd_search, sum, rmqi, RMQi, degree, child, and child_rank, for the rest of the paper.

Our data structure for queries on a 0,1 vector P is basically a search tree in which each leaf
corresponds to a range of P, and each node stores the last, maximum, and minimum values of
prefix sums for the concatenation of all the ranges up to the subtree rooted at that node.

Definition 3 A range min-max tree for a vector P[0, n−1] and a function g(·) is defined as
follows. Let [ℓ_1, r_1], [ℓ_2, r_2], ..., [ℓ_q, r_q] be a partition of [0, n−1] where ℓ_1 = 0,
r_i + 1 = ℓ_{i+1}, r_q = n − 1. Then the i-th leftmost leaf of the tree stores the sub-vector
P[ℓ_i, r_i], as well as e[i] = sum(P, g, 0, r_i), m[i] = e[i−1] + rmq(P, g, ℓ_i, r_i) and
M[i] = e[i−1] + RMQ(P, g, ℓ_i, r_i). Each internal node u stores in e[u]/m[u]/M[u] the
last/minimum/maximum of the e/m/M values stored in its child nodes. Thus, the root node stores
e = sum(P, g, 0, n−1), m = rmq(P, g, 0, n−1) and M = RMQ(P, g, 0, n−1).

Example 1 An example of range min-max tree is shown in Figure 2. Here we use g = π, and thus
the nodes store the minimum/maximum values of array E in the corresponding interval.

4 A simple data structure for polylogarithmic-size trees

Building on the previous ideas, we give a simple data structure to compute fwd_search,
bwd_search, and sum in constant time for arrays of polylogarithmic size. Then we consider
further operations.

Let g(·) be a function on {0, 1} taking values in {1, 0, −1}. We call such a function a ±1
function. Note that there exist only six such functions where g(0) ≠ g(1), which are indeed
φ, −φ, ψ, −ψ, π, −π.
[Tree drawing not recoverable; only the leaf level of the figure is reproduced.]

m/M   1/2  |  2/4  |  3/4  |  2/3  |  1/3  |  2/3  |  1/2  | 0/0
E    1 2 1 | 2 3 4 | 3 4 3 | 2 3 2 | 1 2 3 | 2 3 2 | 1 2 1 | 0
P    ( ( ) | ( ( ( | ) ( ) | ) ( ) | ) ( ( | ) ( ) | ) ( ) | )

Figure 2: An example of the range min-max tree using function π, and showing the m/M values.
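The values in Figure 2 can be reproduced with a small construction sketch. It follows Definition 3 with chunks of s positions and groups k consecutive nodes under one parent, left to right; this grouping and the list-of-levels layout are simplifying assumptions of ours, not the exact heap layout described in the paper.

```python
def build_range_min_max_tree(P, s, k, g):
    """Return the tree as a list of levels (leaves first, root last).
    Every node is a tuple (e, m, M): last, minimum and maximum prefix sum
    of g over the range of P that the node covers."""
    # Prefix sums G[i] = sum(P, g, 0, i).
    G, acc = [], 0
    for bit in P:
        acc += g(bit)
        G.append(acc)

    # Leaves: one per chunk of s consecutive positions.
    leaves = []
    for left in range(0, len(P), s):
        chunk = G[left:left + s]
        leaves.append((chunk[-1], min(chunk), max(chunk)))

    # Internal levels: each node summarizes up to k children.
    levels = [leaves]
    while len(levels[-1]) > 1:
        below, above = levels[-1], []
        for c in range(0, len(below), k):
            group = below[c:c + k]
            above.append((group[-1][0],
                          min(node[1] for node in group),
                          max(node[2] for node in group)))
        levels.append(above)
    return levels


if __name__ == "__main__":
    pi = lambda bit: 1 if bit == 1 else -1
    P = [1 if ch == "(" else 0 for ch in "(()((()())())(()())())"]
    for level in build_range_min_max_tree(P, s=3, k=3, g=pi):
        print(level)
```

With s = k = 3 the leaf level matches the m/M row of the figure, and the root stores the global minimum 0 and maximum 4 of the excess array.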

Let w be the bit length of the machine word in the RAM model, and c > 1 any constant. We
have a (not necessarily balanced) parentheses vector P[0, n−1], of moderate size n < N = w^c.
Assume we wish to solve the operations for an arbitrary ±1 function g(·), and let G[i] denote
sum(P, g, 0, i), analogously to E[i] for g = π.

Our data structure is a range min-max tree T_mM for vector P and function g(·). Let s = w/2.
We conceptually divide vector P into ⌈n/s⌉ chunks of length s. These form the partition alluded
to in Definition 3: ℓ_i = s · (i − 1). Thus the values m[i] and M[i] correspond to minima and
maxima of G within each chunk, and e[i] = G[r_i].

Furthermore, the tree will be k-ary and complete, for k = Θ(w/(c log w)). Thus the leaves store
all the elements of arrays m and M. Because it is complete, the tree can be represented just by
three integer arrays e'[0, O(n/s)], m'[0, O(n/s)], and M'[0, O(n/s)], like a heap.

Because −w^c < e'[i], m'[i], M'[i] < w^c for any i, arrays e', m' and M' occupy
(k/(k−1)) · (n/s) · ⌈log(2w^c + 1)⌉ = O(nc log w/w) bits each. The depth of the tree is
⌈log_k(n/s)⌉ = O(c).

The following fact is well known; we reprove it for completeness. 

Lemma 4 Any range $[i, j] \subseteq [0, n-1]$ in $T_{mM}$ is covered by a disjoint union of $O(ck)$ subranges where the leftmost and rightmost ones may be subranges of leaves of $T_{mM}$, and the others correspond to whole nodes of $T_{mM}$.

Proof. Let $a$ be the smallest value such that $i \le r_a$ and $b$ be the largest such that $j \ge \ell_b$. Then the range $[i, j]$ is covered by the disjoint union $[i, j] = [i, r_a][\ell_{a+1}, r_{a+1}] \cdots [\ell_b, j]$ (we can discard the special case $a = b$, as in this case we have already one leaf covering $[i, j]$). Then $[i, r_a]$ and $[\ell_b, j]$ are the leftmost and rightmost leaf subranges alluded to in the lemma; all the others are whole tree nodes.

It remains to show that we can reexpress this disjoint union using $O(ck)$ tree nodes. If all the $k$ children of a node are in the range, we replace the $k$ children by the parent node, and continue recursively level by level. Note that if two parent nodes are created in a given level, then all the other intermediate nodes of the same level must be created as well, because the original/created nodes form a range at any level. At the end, there cannot be more than $2k - 2$ nodes at any level, because otherwise $k$ of them would share a single parent and would have been replaced. As there are $c$ levels, the obtained set of nodes covering $[i, j]$ is of size $O(ck)$. □

Example 2 In Figure 2 (where $s = k = 3$), the range $[3, 18]$ is covered by $[3, 5]$, $[6, 8]$, $[9, 17]$, $[18, 18]$. They correspond to nodes $d$, $e$, $f$, and a part of leaf $k$, respectively.
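The decomposition of Lemma 4 can be sketched with a top-down recursion over an implicit complete $k$-ary tree. This is only an illustration under assumptions: the function name `cover` is hypothetical, the tree is never materialized, and the recursion is not the constant-time procedure developed later in this section.

```python
def cover(i, j, n, s, k):
    """Split [i, j] into whole tree-node ranges plus at most two partial leaves."""
    span = s
    while span < n:          # size of the smallest complete k-ary tree covering [0, n-1]
        span *= k
    out = []

    def visit(lo, size):
        hi = min(lo + size, n) - 1
        if lo >= n or hi < i or lo > j:
            return                                   # node is disjoint from the query
        if (i <= lo and hi <= j) or size == s:
            out.append((max(lo, i), min(hi, j)))     # whole node, or part of a leaf
            return
        child = size // k
        for c in range(k):                           # otherwise recurse on the children
            visit(lo + c * child, child)

    visit(0, span)
    return out
```

Called as `cover(3, 18, 22, 3, 3)`, it reproduces the four subranges of Example 2.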

Computing $\mathit{fwd\_search}(P, g, i, d)$ is done as follows (bwd_search is symmetric). First we check if the chunk of $i$, $[\ell_k, r_k]$ for $k = \lfloor i/s \rfloor$, contains $\mathit{fwd\_search}(P, g, i, d)$ with a table lookup using vector $P$, by precomputing a simple universal table of $2^s \log s = O(\sqrt{2^w} \log w)$ bits.^1 If so, we are done. Else, we compute the global target value we seek, $d' = G[i-1] + d = e[k] - \mathit{sum}(P, g, i, r_k) + d$ (again, the sum inside the chunk is done in constant time using table lookup). Now we divide the range $[r_k + 1, n-1]$ into subranges $I_1, I_2, \ldots$ represented by range min-max tree nodes $u_1, u_2, \ldots$ as in Lemma 4 (note these are simply all the right siblings of the leaf of $i$, all the right siblings of its parent, all the right siblings of its grandparent, and so on). Then, for each $I_j$, we check if the target value $d'$ is between $m[u_j]$ and $M[u_j]$, the minimum and maximum values of subrange $I_j$. Let $I_k$ be the first $I_j$ such that $m[u_j] \le d' \le M[u_j]$; then $\mathit{fwd\_search}(P, g, i, d)$ lies within $I_k$. If $I_k$ corresponds to an internal tree node, we iteratively find the leftmost child of the node whose range contains $d'$, until we reach a leaf. Finally, we find the target in the chunk corresponding to the leaf by table lookups, using $P$ again.

Example 3 In Figure 2, where $G = E$ and $g = \pi$, computing $\mathit{findclose}(3) = \mathit{fwd\_search}(P, \pi, 3, 0) = 12$ can be done as follows. Note this is equivalent to finding the first $j > 3$ such that $E[j] = E[3-1] + 0 = 1$. First examine the node $\lfloor 3/s \rfloor = 1$ (labeled $d$ in the figure). We see that the target 1 does not exist within $d$ after position 3. Next we examine node $e$. Since $m[e] = 3$ and $M[e] = 4$, $e$ does not contain the answer either. Next we examine the node $f$. Because $m[f] = 1$ and $M[f] = 3$, the answer must exist in its subtree. Therefore we scan the children of $f$ from left to right, and find the leftmost one with $m[\cdot] \le 1$, which is node $h$. Because node $h$ is already a leaf, we scan the segment corresponding to it, and find the answer 12.
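The search just traced can be sketched in Python as follows. This is a simplified illustration, not the constant-time structure: the names `build_levels` and `fwd_search` are assumptions of the sketch, the tree is stored as one list of $(m, M)$ pairs per level rather than in heap order, only $g = \pi$ is handled, and chunks and sibling groups are scanned linearly where the text uses universal tables.

```python
def build_levels(E, s, k):
    """levels[0] holds (m, M) per chunk; each higher level groups k nodes."""
    level = [(min(E[a:a + s]), max(E[a:a + s])) for a in range(0, len(E), s)]
    levels = [level]
    while len(level) > 1:
        level = [(min(x[0] for x in level[a:a + k]), max(x[1] for x in level[a:a + k]))
                 for a in range(0, len(level), k)]
        levels.append(level)
    return levels

def fwd_search(P, i, d, s=3, k=3):
    """First j >= i with E[j] = E[i-1] + d, or None if there is none."""
    E, x = [], 0
    for ch in P:
        x += 1 if ch == "(" else -1
        E.append(x)
    n = len(E)
    target = (E[i - 1] if i > 0 else 0) + d
    levels = build_levels(E, s, k)
    node = i // s
    for j in range(i, min((node + 1) * s, n)):       # rest of the chunk of i
        if E[j] == target:
            return j
    h = 0
    while h < len(levels) - 1:                       # climb, testing right siblings
        last = min((node // k + 1) * k, len(levels[h]))
        for u in range(node + 1, last):
            if levels[h][u][0] <= target <= levels[h][u][1]:
                while h > 0:                         # descend to the leftmost fitting child
                    h -= 1
                    u = next(c for c in range(u * k, min(u * k + k, len(levels[h])))
                             if levels[h][c][0] <= target <= levels[h][c][1])
                return next(j for j in range(u * s, min(u * s + s, n)) if E[j] == target)
        node //= k
        h += 1
    return None
```

With the sequence of Figure 2, `fwd_search(P, 3, 0)` follows the same path as Example 3: the chunk of position 3, then node $e$, then node $f$ and its child $h$.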

The sequence of subranges arising in this search corresponds to a leaf-to-leaf path in the range min-max tree, and it contains $O(ck)$ ranges according to Lemma 4. We show now how to carry out this search in time $O(c)$ rather than $O(ck)$.

According to Lemma 4, the $O(ck)$ nodes can be partitioned into $O(c)$ sequences of sibling nodes. We will manage to carry out the search within each such sequence in $O(1)$ time. Assume we have to find the first $j \ge 1$ such that $m[u_j] \le d' \le M[u_j]$, where $u_1, u_2, \ldots, u_k$ are sibling nodes in $T_{mM}$. We first check if $m[u_1] \le d' \le M[u_1]$. If so, the answer is $u_1$. Otherwise, if $d' < m[u_1]$, the answer is the first $j > 1$ such that $m[u_j] \le d'$, and if $d' > M[u_1]$, the answer is the first $j > 1$ such that $M[u_j] \ge d'$.

Lemma 5 Let $u_1, u_2, \ldots$ be a sequence of $T_{mM}$ nodes containing consecutive intervals of $P$. If $g(\cdot)$ is a $\pm 1$ function and $d < m[u_1]$, then the first $j$ such that $d \in [m[u_j], M[u_j]]$ is the first $j > 1$ such that $d \ge m[u_j]$. Similarly, if $d > M[u_1]$, then it is the first $j > 1$ such that $d \le M[u_j]$.

Proof. Since $g(\cdot)$ is a $\pm 1$ function and the intervals are consecutive, $M[u_j] \ge m[u_{j-1}] - 1$ and $m[u_j] \le M[u_{j-1}] + 1$. Therefore, if $d \ge m[u_j]$ and $d < m[u_{j-1}]$, then $d < M[u_j] + 1$, thus $d \in [m[u_j], M[u_j]]$; and of course $d \notin [m[u_k], M[u_k]]$ for any $k < j$ as $j$ is the first index such that $d \ge m[u_j]$. The other case is symmetric. □
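A small Python sketch of how Lemma 5 is used: after comparing $d$ with the first node, a single one-sided comparison per sibling suffices. The function name `first_containing` is hypothetical, indices are 0-based here, and the loop over siblings is linear, whereas the text replaces it with a table lookup.

```python
def first_containing(nodes, d):
    """Index of the first (m, M) pair with m <= d <= M, using one-sided tests only.

    Assumes the pairs come from consecutive intervals of a +-1 sequence (Lemma 5).
    """
    m1, M1 = nodes[0]
    if m1 <= d <= M1:
        return 0
    for j in range(1, len(nodes)):
        m, M = nodes[j]
        if d < m1 and m <= d:      # target below the first node: test minima only
            return j
        if d > M1 and M >= d:      # target above the first node: test maxima only
            return j
    return None

# Children g, h, i of node f in Figure 2, searching for the target value 1.
answer = first_containing([(2, 3), (1, 3), (2, 3)], 1)
```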

^1 Using integer division and remainder, a segment within a chunk can be isolated and padded in constant time.

Thus the problem is reduced to finding the first $j \ge 1$ such that $m[j] \le d'$, among (at most) $k$ sibling nodes (the case $M[j] \ge d'$ is symmetric). We build a universal table with all the possible sequences of $k$ values $m[\cdot]$ and all possible $-w^c \le d' \le w^c$ values, and for each such sequence and $d'$ we store the first $j$ in the sequence such that $m[j] \le d'$ (or we store a mark telling that there is no such position in the sequence). Thus the table has $(2w^c + 1)^{k+1}$ entries, and $\log(k+1)$ bits per entry. By choosing the constant of $k = \Theta(w/(c \log w))$ so that $k \le \frac{w}{2 \log(2w^c + 1)} - 1$, the total space is $O(\sqrt{2^w} \log w)$ (and the arguments for the table fit in a machine word). With the table, each search for the first node in a sequence of siblings can be done in constant rather than $O(k)$ time, and hence the overall time is $O(c)$ rather than $O(ck)$. Note that we store the $m'[\cdot]$ values in heap order, and therefore the $k$ sibling values to input to the table are stored in contiguous memory, thus they can be accessed in constant time. We use an analogous universal table for $M[\cdot]$.

Finally, the process to solve $\mathit{sum}(P, g, i, j)$ in $O(c)$ time is simple. We descend in the tree down to the leaf $[\ell_k, r_k]$ containing $j$. We obtain $\mathit{sum}(P, g, 0, \ell_k - 1) = e[k-1]$ and compute the rest, $\mathit{sum}(P, g, \ell_k, j)$, in constant time using a universal table we have already introduced. We repeat the process for $\mathit{sum}(P, g, 0, i-1)$ and then subtract both results.

We have proved the following lemma. 

Lemma 6 In the RAM model with $w$-bit word size, for any constant $c > 1$ and a 0,1 vector $P$ of length $n < w^c$, and a $\pm 1$ function $g(\cdot)$, $\mathit{fwd\_search}(P, g, i, j)$, $\mathit{bwd\_search}(P, g, i, j)$, and $\mathit{sum}(P, g, i, j)$ can be computed in $O(c)$ time using the range min-max tree and universal lookup tables that require $O(\sqrt{2^w} \log w)$ bits.

4.1 Supporting range minimum queries 

Next we consider how to compute $\mathit{rmqi}(P, g, i, j)$ and $\mathit{RMQi}(P, g, i, j)$.

Lemma 7 In the RAM model with $w$-bit word size, for any constant $c > 1$ and a 0,1 vector $P$ of length $n < w^c$, and a $\pm 1$ function $g(\cdot)$, $\mathit{rmqi}(P, g, i, j)$ and $\mathit{RMQi}(P, g, i, j)$ can be computed in $O(c)$ time using the range min-max tree and universal lookup tables that require $O(\sqrt{2^w} \log w)$ bits.

Proof. Because the algorithm for RMQi is analogous to that for rmqi, we consider only the latter. From Lemma 4, the range $[i, j]$ is covered by a disjoint union of $O(ck)$ subranges, each corresponding to some node of the range min-max tree. Let $\mu_1, \mu_2, \ldots$ be the minimum values of the subranges. Then the minimum value in $[i, j]$ is the minimum of them. The minimum values in each subrange are stored in array $m'$, except for at most two subranges corresponding to leaves of the range min-max tree. The minimum values of such leaf subranges are found by table lookups using $P$, by precomputing a universal table of $O(\sqrt{2^w} \log w)$ bits. The minimum value of a subsequence $\mu_\ell, \ldots, \mu_r$ which shares the same parent in the range min-max tree can be also found by table lookups. The size of such universal table is $O((2w^c + 1)^k k \log k) = O(\sqrt{2^w})$ bits (the $k$ factor is to account for queries that span less than $k$ values, so we can specify the query length). Hence we find the node containing the minimum value $\mu$ among $\mu_1, \mu_2, \ldots$, in $O(c)$ time. If there is a tie, we choose the leftmost one.

If $\mu$ corresponds to an internal node of the range min-max tree, we traverse the tree from the node to a leaf having the leftmost minimum value. At each step, we find the leftmost child of the current node having the minimum, in constant time using our precomputed table. We repeat the process from the resulting child, until reaching a leaf. Finally, we find the index of the minimum value in the leaf, in constant time by a lookup on our universal table for leaves. The overall time complexity is $O(c)$. □



4.2 Other operations 

The previous development on fwd_search, bwd_search, rmqi, and RMQi has been general, for any $g(\cdot)$. Applied to $g = \pi$, they solve a large number of operations, as shown in Section 3. For the remaining ones we focus directly on the case $g = \pi$.

It is obvious how to compute $\mathit{degree}(i)$, $\mathit{child}(i, q)$ and $\mathit{child\_rank}(i)$ in time proportional to the degree of the node. To compute them in constant time, we add another array $n'[\cdot]$ to the data structure. In the range min-max tree, each node stores the minimum value of the subrange of the node. In addition to this, we store in $n'[\cdot]$ the number of occurrences of the minimum value in each subrange of the tree.

Lemma 8 The number of children of node $i$ is equal to the number of occurrences of the minimum value in $E[i+1, \mathit{findclose}(i) - 1]$.

Proof. Let $d = E[i] = \mathit{depth}(i)$ and $j = \mathit{findclose}(i)$. Then $E[j] = d - 1$ and all excess values in $E[i+1, j-1]$ are $\ge d$. Therefore the minimum value in $E[i+1, j-1]$ is $d$. Moreover, for the range $[i_k, j_k]$ corresponding to the $k$-th child of $i$, $E[i_k] = d + 1$, $E[j_k] = d$, and all the values between them are $> d$. Therefore the number of occurrences of $d$, which is the minimum value in $E[i+1, j-1]$, is equal to the number of children of $i$. □
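Lemma 8 can be checked directly with a linear-time Python sketch. The function name `degree` mirrors the operation of the text, but the body is only an illustration: it scans $E$ instead of using the $n'[\cdot]$ array and universal tables.

```python
def degree(P, i):
    """Number of children of the node whose opening parenthesis is P[i]."""
    E, x = [], 0
    for ch in P:
        x += 1 if ch == "(" else -1
        E.append(x)
    # findclose(i): first position after i where the excess drops to E[i] - 1
    j = next(t for t in range(i, len(P)) if E[t] == E[i] - 1)
    inner = E[i + 1:j]                    # E[i+1 .. findclose(i)-1]
    return inner.count(min(inner)) if inner else 0
```

On the sequence of Figure 2, the root (position 0) has its four children closing at positions 2, 12, 18 and 20, which are exactly the positions where $E$ equals 1 inside the root.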

Now we can compute $\mathit{degree}(i)$ in constant time. Let $d = \mathit{depth}(i)$ and $j = \mathit{findclose}(i)$. We partition the range $E[i+1, j-1]$ into $O(ck)$ subranges, each of which corresponds to a node of the range min-max tree. Then for each subrange whose minimum value is $d$, we sum up the number of occurrences of the minimum value ($n'[\cdot]$). The number of occurrences of the minimum value in leaf subranges can be computed by table lookup on $P$, with a universal table using $O(\sqrt{2^w} \log w)$ bits. The time complexity is $O(c)$ if we use universal tables that let us process sequences of (up to) $k$ children at once, that is, telling the minimum $m[\cdot]$ value within the sequence and the number of times it appears. This table requires $O((2w^c + 1)^k k \log k) = O(\sqrt{2^w})$ bits.

Operation $\mathit{child\_rank}(i)$ can be computed similarly, by counting the number of minima in $E[\mathit{parent}(i), i-1]$. Operation $\mathit{child}(i, q)$ follows the same idea of $\mathit{degree}(i)$, except that, in the node where the sum of $n'[\cdot]$ exceeds $q$, we must descend until the range min-max leaf that contains the opening parenthesis of the $q$-th child. This search is also guided by the $n'[\cdot]$ values of each node, and is done also in $O(c)$ time. Here we need another universal table that tells at which position the number of occurrences of the minimum value exceeds some threshold, which requires $O((2w^c + 1)^k (2w^c + 1) \log k) = O(\sqrt{2^w} \log w)$ bits.

For operations leaf_rank, leaf_select, lmost_leaf and rmost_leaf, we define a bit-vector $P_1[0, n-1]$ such that $P_1[i] = 1 \iff P[i] = 1 \land P[i+1] = 0$. Then $\mathit{leaf\_rank}(i) = \mathit{rank}_1(P_1, i)$ and $\mathit{leaf\_select}(i) = \mathit{select}_1(P_1, i)$ hold. The other operations are computed by $\mathit{lmost\_leaf}(i) = \mathit{select}_1(P_1, \mathit{rank}_1(P_1, i-1) + 1)$ and $\mathit{rmost\_leaf}(i) = \mathit{select}_1(P_1, \mathit{rank}_1(P_1, \mathit{findclose}(i)))$.
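The leaf operations can be illustrated with a direct Python transcription of these formulas. Everything here is a sketch under assumptions: `leaf_bits`, `rank1` and `select1` are hypothetical helper names, rank and select are computed by linear scans, and `findclose` is a naive scan rather than a fwd_search on the range min-max tree.

```python
def findclose(P, i):
    depth = 0
    for t in range(i, len(P)):
        depth += 1 if P[t] == "(" else -1
        if depth == 0:
            return t

def leaf_bits(P):
    """Virtual vector P_1: 1 exactly where '(' is immediately followed by ')'."""
    return [int(P[t:t + 2] == "()") for t in range(len(P))]

def rank1(B, i):
    return sum(B[:i + 1])                              # ones in B[0..i]; 0 for i = -1

def select1(B, q):
    return [t for t, b in enumerate(B) if b][q - 1]    # position of the q-th one

def lmost_leaf(P, i):
    P1 = leaf_bits(P)
    return select1(P1, rank1(P1, i - 1) + 1)

def rmost_leaf(P, i):
    P1 = leaf_bits(P)
    return select1(P1, rank1(P1, findclose(P, i)))
```

For the node opening at position 3 of the Figure 2 sequence, the leftmost and rightmost leaves are the `()` pairs starting at positions 5 and 10.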

We recall the definition of inorder of nodes, which is essential for compressed suffix trees. 

Definition 4 ([47]) The inorder rank of an internal node $v$ is defined as the number of visited internal nodes, including $v$, in a left-to-right depth-first traversal, when $v$ is visited from a child of it and another child of it will be visited next.

Note that an internal node with $q$ children has $q - 1$ inorders, so leaves and unary nodes have no inorder. We define $\mathit{in\_rank}(i)$ as the smallest inorder value of internal node $i$.

To compute in_rank and in_select, we use another bit-vector $P_2[0, n-1]$ such that $P_2[i] = 1 \iff P[i] = 0 \land P[i+1] = 1$. The following lemma gives an algorithm to compute the inorder of an internal node.

Lemma 9 ([47]) Let $i$ be an internal node, and let $j = \mathit{in\_rank}(i)$, so $i = \mathit{in\_select}(j)$. Then

$\mathit{in\_rank}(i) = \mathit{rank}_1(P_2, \mathit{findclose}(P, i+1))$
$\mathit{in\_select}(j) = \mathit{enclose}(P, \mathit{select}_1(P_2, j) + 1)$

Note that $\mathit{in\_select}(j)$ will return the same node $i$ for any of its $\mathit{degree}(i) - 1$ inorder values.
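The two formulas of Lemma 9 can be tried out with the following Python sketch. The helper `inorder_bits` is a hypothetical name for the virtual vector $P_2$, and `findclose` and `enclose` are naive linear scans assumed for illustration only.

```python
def findclose(P, i):
    depth = 0
    for t in range(i, len(P)):
        depth += 1 if P[t] == "(" else -1
        if depth == 0:
            return t

def enclose(P, i):
    """Opening parenthesis of the parent of the node that opens at P[i]."""
    depth = 0
    for t in range(i - 1, -1, -1):
        if P[t] == ")":
            depth += 1            # skip over a complete sibling subtree
        elif depth == 0:
            return t              # first unmatched '(' to the left
        else:
            depth -= 1

def inorder_bits(P):
    """Virtual vector P_2: 1 exactly where ')' is immediately followed by '('."""
    return [int(P[t:t + 2] == ")(") for t in range(len(P))]

def in_rank(P, i):                # i must be an internal node, so P[i+1] == '('
    P2 = inorder_bits(P)
    return sum(P2[:findclose(P, i + 1) + 1])

def in_select(P, j):
    P2 = inorder_bits(P)
    pos = [t for t, b in enumerate(P2) if b][j - 1]
    return enclose(P, pos + 1)
```

On the Figure 2 sequence the root has four children and therefore three inorder values; `in_select` maps all three back to position 0.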

Note that we need not store $P_1$ and $P_2$ explicitly; they can be computed from $P$ when needed. We only need the extra data structures for constant-time rank and select, which can be reduced to the corresponding sum and fwd_search operations on the virtual $P_1$ and $P_2$ vectors.

4.3 Reducing extra space 

Apart from vector $P[0, n-1]$, we need to store vectors $e'$, $m'$, $M'$, and $n'$. In addition, to implement rank and select using sum and fwd_search, we would need to store vectors $e'_\phi$, $e'_\psi$, $m'_\phi$, $m'_\psi$, $M'_\phi$, and $M'_\psi$, which maintain the corresponding values for functions $\phi$ and $\psi$. However, note that $\mathit{sum}(P, \phi, 0, i)$ and $\mathit{sum}(P, \psi, 0, i)$ are nondecreasing, thus the minimum/maximum within the chunk is just the value of the sum at the beginning/end of the chunk. Moreover, as $\mathit{sum}(P, \pi, 0, i) = \mathit{sum}(P, \phi, 0, i) - \mathit{sum}(P, \psi, 0, i)$ and $\mathit{sum}(P, \phi, 0, i) + \mathit{sum}(P, \psi, 0, i) = i$, it turns out that both $e_\phi[i] = (r_i + e[i])/2$ and $e_\psi[i] = (r_i - e[i])/2$ are redundant. Analogous formulas hold for internal nodes. Moreover, any sequence of $k$ consecutive such values can be obtained, via table lookup, from the sequence of $k$ consecutive values of $e[\cdot]$, because the $r_i$ values increase regularly at any node. Hence we do not store any extra information to support $\phi$ and $\psi$.

If we store vectors $e'$, $m'$, $M'$, and $n'$ naively, we require $O(nc \log w / w)$ bits of extra space on top of the $n$ bits for $P$.

The space can be largely reduced by using a recent technique by Pătraşcu [42]. They define an aB-tree over an array $A[0, n-1]$, for $n$ a power of $B$, as a complete tree of arity $B$, storing $B$ consecutive elements of $A$ in each leaf. Additionally, a value $\varphi \in \Phi$ is stored at each node. This must be a function of the corresponding elements of $A$ for the leaves, and a function of the values of the children and of the subtree size, for internal nodes. The construction is able to decode the $B$ values of $\varphi$ for the children of any node in constant time, and to decode the $B$ values of $A$ for the leaves in constant time, if they can be packed in a machine word.

In our case, $A = P$ is the vector, $B = k = s$ is our arity, and our trees will be of size $N = B^c$, which is slightly smaller than the $w^c$ we have been assuming. Our values are tuples $\varphi \in \langle -B^c, -B^c, 0, -B^c \rangle \ldots \langle B^c, B^c, B^c, B^c \rangle$ encoding the $m$, $M$, $n$, and $e$ values at the nodes, respectively. We give next their result, adapted to our case.

Lemma 10 (adapted from Thm. 8 in [42]) Let $|\Phi| = (2B + 1)^{4c}$, and $B$ be such that $(B + 1) \log(2B + 1) \le \frac{w}{8c}$ (thus $B = \Theta(\frac{w}{c \log w})$). An aB-tree of size $N = B^c$ with values in $\Phi$ can be stored using $N + 2$ bits, plus universal lookup tables of $O(\sqrt{2^w})$ bits. It can obtain the $m$, $M$, $n$ or $e$ values of the children of any node, and descend to any of those children, in constant time. The structure can be built in $O(N + w^{3/2})$ time, plus $O(\sqrt{2^w}\,\mathrm{poly}(w))$ for the universal tables.

The $w^{3/2}$ construction time comes from a fusion tree [18] that is used internally on $O(w)$ values. It could be reduced to $w^{1+\epsilon}$ time for any constant $\epsilon > 0$ and navigation time $O(1/\epsilon)$, but we prefer to set $c > 3/2$ so that $N = B^c$ dominates it.

These parameters still allow us to represent our range min-max trees while yielding the complexities we had found, as $k = \Theta(w/(c \log w))$ and $N < w^c$. Our accesses to the range min-max tree are either (i) partitioning intervals $[i, j]$ into $O(ck)$ subranges, which are easily identified by navigating from the root in $O(c)$ time (as the $k$ children are obtained together in constant time); or (ii) navigating from the root while looking for some leaf based on the intermediate $m$, $M$, $n$, or $e$ values. Thus we retain all of our time complexities.

The space, instead, is reduced to $N + 2 + O(\sqrt{2^w})$, where the latter part comes from our universal tables and those of Lemma 10 (our universal tables become smaller with the reduction from $w$ and $s$ to $B$). Note that our vector $P$ must be exactly of length $N$; padding is necessary otherwise. Both the padding and the universal tables will lose relevance for larger trees, as seen in the next section.

The next theorem summarizes our results in this section. 

Theorem 4 On a $w$-bit word RAM, for any constant $c > 3/2$, we can represent a sequence $P$ of $N = B^c$ parentheses, for sufficiently small $B = O(\frac{w}{c \log w})$, computing all operations of Table 1 in $O(c)$ time, with a data structure depending on $P$ that uses $N + 2$ bits, and universal tables (i.e., not depending on $P$) that use $O(\sqrt{2^w})$ bits. The preprocessing time is $O(N + \sqrt{2^w}\,\mathrm{poly}(w))$ (the latter being needed only once for universal tables) and its working space is $O(N)$ bits.

In case we need to solve the operations that build on $P_1$ and $P_2$, we need to represent their corresponding $\phi$ functions (as $\psi$ is redundant). This can still be done with Lemma 10 using $|\Phi| = (2B + 1)^{6c}$ and $(B + 1) \log(2B + 1) \le \frac{w}{12c}$. Theorem 4 applies verbatim.



5 A data structure for large trees 

In practice, one can use the solution of the previous section for trees of any size, achieving $O(\frac{k \log n}{w} \log_k n) = O(\frac{\log n}{\log w - \log\log n}) = O(\log n)$ time (using $k = w/\log n$) for all operations with an extremely simple and elegant data structure (especially if we choose to store arrays $m'$, etc. in simple form). In this section we show how to achieve constant time on trees of arbitrary size.

For simplicity, let us assume in this section that we handle trees of size $w^c$ in Section 4. We comment at the end on the difference with the actual size $B^c$ handled.

For large trees with $n > w^c$ nodes, we divide the parentheses sequence into blocks of length $w^c$. Each block (containing a possibly non-balanced sequence of parentheses) is handled with the range min-max tree of Section 4.

Let $m_1, m_2, \ldots, m_\tau$; $M_1, M_2, \ldots, M_\tau$; and $e_1, e_2, \ldots, e_\tau$ be the minima, maxima, and excess of the $\tau = \lceil 2n/w^c \rceil$ blocks, respectively. These values are stored at the root nodes of each $T_{mM}$ tree and can be obtained in constant time.



5.1 Forward and backward searches on tt 

We consider extending $\mathit{fwd\_search}(P, \pi, i, d)$ and $\mathit{bwd\_search}(P, \pi, i, d)$ to trees of arbitrary size. We focus on fwd_search, as bwd_search is symmetric.

We first try to solve $\mathit{fwd\_search}(P, \pi, i, d)$ within the block $j = \lfloor i/w^c \rfloor$ of $i$. If the answer is within block $j$, we are done. Otherwise, we must look for the first excess $d' = e_{j-1} + \mathit{sum}(P, \pi, 0, i - 1 - w^c \cdot (j-1)) + d$ in the following blocks (where the sum is local to block $j$). Then the answer lies in the first block $r > j$ such that $m_r \le d' \le M_r$. Thus, we can apply again Lemma 5, starting at $[m_{j+1}, M_{j+1}]$: If $d' \notin [m_{j+1}, M_{j+1}]$, we must either find the first $r > j + 1$ such that $m_r \le d'$, or such that $M_r \ge d'$. Once we find such a block, we complete the operation with a local $\mathit{fwd\_search}(P, \pi, 0, d' - e_{r-1})$ query inside it.

Figure 3: A tree representing the $\mathit{lrm}(j)$ sequences of values $m_1 \ldots m_9$.

The problem is how to achieve constant-time search, for any $j$, in a sequence of length $\tau$. Let us focus on left-to-right minima, as the others are similar.

Definition 5 Let $m_1, m_2, \ldots, m_\tau$ be a sequence of integers. We define for each $1 \le j \le \tau$ the left-to-right minima starting at $j$ as $\mathit{lrm}(j) = \langle j_0, j_1, j_2, \ldots \rangle$, where $j_0 = j$, $j_r < j_{r+1}$, $m_{j_{r+1}} < m_{j_r}$, and $m_{j_r + 1}, \ldots, m_{j_{r+1} - 1} \ge m_{j_r}$.

The following lemmas are immediate.

Lemma 11 The first element $\le x$ after position $j$ in a sequence of integers $m_1, m_2, \ldots, m_\tau$ is $m_{j_r}$ for some $r > 0$, where $j_r \in \mathit{lrm}(j)$.

Lemma 12 Let $\mathit{lrm}(j)[p_j] = \mathit{lrm}(j')[p_{j'}]$. Then $\mathit{lrm}(j)[p_j + i] = \mathit{lrm}(j')[p_{j'} + i]$ for all $i > 0$.

That is, once the lrm sequences starting at two positions coincide in a position, they coincide thereafter. Lemma 12 is essential to store all the $\tau$ sequences $\mathit{lrm}(j)$ for each block $j$, in compact form. We form a tree $T_{lrm}$, which is essentially a trie composed of the reversed $\mathit{lrm}(j)$ sequences. The tree has $\tau$ nodes, one per block. Block $j$ is a child of block $j_1 = \mathit{lrm}(j)[1]$ (note $\mathit{lrm}(j)[0] = j_0 = j$), that is, $j$ is a child of the first block $j_1 > j$ such that $m_{j_1} < m_j$. Thus each $j$-to-root path spells out $\mathit{lrm}(j)$, by Lemma 12. We add a fictitious root to convert the forest into a tree. Note this structure is called 2d-Min-Heap by Fischer [14], who shows how to build it in linear time.

Example 4 Figure 3 illustrates the tree built from the sequence $\langle m_1 \ldots m_9 \rangle = \langle 6, 4, 9, 7, 4, 4, 1, 8, 5 \rangle$. Then $\mathit{lrm}(1) = \langle 1, 2, 7 \rangle$, $\mathit{lrm}(2) = \langle 2, 7 \rangle$, $\mathit{lrm}(3) = \langle 3, 4, 5, 7 \rangle$, and so on.
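The tree $T_{lrm}$ of Example 4 can be built with a single left-to-right pass and a stack, as in the following Python sketch. The names `lrm_parents`, `lrm` and `first_at_most` are assumptions of this sketch; positions are 1-based as in the text, 0 plays the role of the fictitious root, and the search walks the parent chain node by node (Lemma 11) instead of answering in constant time.

```python
def lrm_parents(m):
    """parent[j] = first position j1 > j with m[j1] < m[j]; 0 is the fictitious root."""
    parent = [0] * (len(m) + 1)
    stack = []                                   # positions still waiting for a smaller value
    for j in range(1, len(m) + 1):
        while stack and m[j - 1] < m[stack[-1] - 1]:
            parent[stack.pop()] = j
        stack.append(j)
    return parent

def lrm(parent, j):
    """The sequence lrm(j), read off the j-to-root path."""
    seq = []
    while j != 0:
        seq.append(j)
        j = parent[j]
    return seq

def first_at_most(m, parent, j, x):
    """First position r > j with m[r] <= x, assuming x < m[j]."""
    node = parent[j]
    while node != 0:
        if m[node - 1] <= x:
            return node
        node = parent[node]
    return None

m = [6, 4, 9, 7, 4, 4, 1, 8, 5]                  # the sequence of Example 4
parent = lrm_parents(m)
```

Here `lrm(parent, 3)` walks the path 3, 4, 5, 7, matching the third sequence listed in Example 4.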

If we now assign weight $m_j - m_{j_1}$ to the edge between $j$ and its parent $j_1$, the original problem of finding the first $j_r > j$ such that $m_{j_r} \le d'$ reduces to finding the first ancestor $j_r$ of node $j$ such that the sum of the weights between $j$ and $j_r$ is at least $d'' = m_j - d'$. Thus we need to compute weighted level ancestors in $T_{lrm}$. Note that the weight of an edge in $T_{lrm}$ is at most $w^c$.

Lemma 13 For a tree with $\tau$ nodes where each edge has an integer weight in $[1, W]$, after $O(\tau \log^{1+\epsilon} \tau)$ time preprocessing, a weighted level-ancestor query is solved in $O(t + 1/\epsilon)$ time on a $\Omega(\log(\tau W))$-bit word RAM. The size of the data structure is $O(\tau \log \tau \log(\tau W) + \frac{\tau W t^t}{\log^t(\tau W)} + (\tau W)^{3/4})$ bits.

Proof. We use a variant of Bender and Farach's $\langle O(\tau \log \tau), O(1) \rangle$ algorithm [5]. Let us ignore weights for a while. We extract a longest root-to-leaf path of the tree, which disconnects the tree into several subtrees. Then we repeat the process recursively for each subtree, until we have a set of paths. Each such path, say of length $\ell$, is extended upwards, adding other $\ell$ nodes towards the root (or less if the root is reached). The extended path is called a ladder, and it is stored as an array so that level-ancestor queries within a ladder are trivial. This partitioning guarantees that a node of height $h$ has also height $h$ in its path, and thus at least its first $h$ ancestors are in its ladder. Moreover the union of all ladders has at most $2\tau$ nodes and thus requires $O(\tau \log \tau)$ bits.

For each tree node $v$, an array of its (at most) $\log \tau$ ancestors at depths $\mathit{depth}(v) - 2^i$, $i \ge 0$, is stored (hence the $O(\tau \log \tau)$-words space and time). To solve the query $\mathit{level\_ancestor}(v, d)$, where $d' = \mathit{depth}(v) - d$, the ancestor $v'$ at distance $d'' = 2^{\lfloor \log d' \rfloor}$ from $v$ is computed. Since $v'$ has height at least $d''$, it has at least its first $d''$ ancestors in its ladder. But from $v'$ we need only the ancestor at distance $d' - d'' < d''$, so the answer is in the ladder.

To include the weights, we must be able to find the node $v'$ and the answer considering the weights, instead of the number of nodes. We store for each ladder of length $\ell$ a sparse bitmap of length at most $\ell W$, where the $i$-th 1 left-to-right represents the $i$-th node upwards in the ladder, and the distance between two 1s, the weight of the edge between them. All the bitmaps are concatenated into one (so each ladder is represented by a couple of integers indicating the extremes of its bitmap). This long bitmap contains at most $2\tau$ 1s, and because weights do not exceed $W$, at most $2\tau W$ 0s. Using Pătraşcu's sparse bitmaps [42], it can be represented using $O(\tau \log W + \frac{\tau W t^t}{\log^t(\tau W)} + (\tau W)^{3/4})$ bits and do rank/select in $O(t)$ time.

In addition, we store for each node the $\log \tau$ accumulated weights towards ancestors at distances $2^i$, using fusion trees [18]. These can store $z$ keys of $\ell$ bits in $O(z\ell)$ bits and, using $O(z^{3/2})$ preprocessing time, answer predecessor queries in $O(\log_\ell z)$ time (via an $\ell^{1/6}$-ary tree). The 1/6 can be reduced to achieve $O(z^{1+\epsilon})$ preprocessing time and $O(1/\epsilon)$ query time for any desired constant $0 < \epsilon \le 1/2$.

In our case this means $O(\tau \log \tau \log(\tau W))$ bits of space, $O(\tau \log^{1+\epsilon} \tau)$ construction time, and $O(1/\epsilon)$ access time. Thus we can find in constant time, from each node $v$, the corresponding weighted ancestor $v'$ using a predecessor query. If this corresponds to (unweighted) distance $2^i$, then the true ancestor is at distance $< 2^{i+1}$, and thus it is within the ladder of $v'$, where it is found using rank/select on the bitmap of ladders (each node $v$ has a pointer to its 1 in the ladder corresponding to the path it belongs to). □

To apply this lemma for our problem of computing fwd_search outside blocks, we have $W = w^c$ and $\tau = O(n/w^c)$. Then the size of the data structure becomes $O(\frac{n \log^2 n}{w^c} + \frac{n t^t}{\log^t n} + n^{3/4})$. By choosing $\epsilon = \min(1/2, 1/c)$, the query time is $O(c + t)$ and the preprocessing time is $O(n)$ for $c > 3/2$.

5.2 Other operations 

For computing rmqi and RMQi, we use a simple data structure [1] on the $m_r$ and $M_r$ values, later improved to require only $O(\tau)$ bits on top of the sequence of values [14, 15]. The extra space is thus $O(n/w^c)$ bits, and it solves any query up to the block granularity. For solving a general query $[i, j]$ we should compare the minimum/maximum obtained with the result of running queries rmqi and RMQi within the blocks at the two extremes of the range $[i, j]$.

We consider all pairs $(i, j)$ of matching parentheses ($j = \mathit{findclose}(i)$) such that $i$ and $j$ belong to different blocks. If we define a graph whose vertices are blocks and the edges are the pairs of parentheses considered, the graph is outer-planar since the parenthesis pairs nest [26], yet there are multiple edges among nodes. To remove these, we choose the tightest pair of parentheses for each pair of vertices. These parentheses are called pioneers. Since they correspond to edges of a planar graph, the number of pioneers is $O(n/w^c)$.

For computing child, child_rank, and degree, it is enough to consider only nodes which completely include a block (otherwise the query is solved in constant time by considering just two adjacent blocks; we can easily identify such nodes using findclose). Furthermore, among them, it is enough to consider pioneers: Assume $(i, i')$ contains a whole block but is not a pioneer pair of parentheses. Then there exists a pioneer pair $(j, j')$ contained in $(i, i')$ where $j$ is in the same block of $i$ and $j'$ is in the same block of $i'$. Thus the block contains no children of $(i, i')$, as all descend from $(j, j')$. Moreover, all the children of $(i, i')$ start either in the block of $i$ or in the block of $i'$, since $(j, j')$ or an ancestor of it is a child of $(i, i')$. So again the operations are solved in constant time by considering two blocks. Such cases can be identified by doing findclose on the last child of $i$ starting in its block and seeing if that child closes in the block of $i'$.

Let us call marked the nodes to consider (that is, pioneers that contain a whole block). There are $O(n/w^c)$ marked nodes, thus for degree we can simply store the degrees of marked nodes using $O(\frac{n \log n}{w^c})$ bits of space, and the others are computed in constant time as explained.

For child and child_rank, we set up a bitmap $C[0, 2n-1]$ where marked nodes $v$ are indicated with $C[v] = 1$, and preprocess $C$ for rank queries so that satellite information can be associated to marked nodes. Using again Pătraşcu's result [42], vector $C$ can be represented in at most $\frac{2n}{w^c} \log(w^c) + O(\frac{n t^t}{\log^t n} + n^{3/4})$ bits, so that access and operation rank can be computed in $O(t)$ time.

We will focus on children of marked nodes placed at the blocks fully contained in the nodes, as 
the others are in at most the two extreme blocks and can be dealt with in constant time. Note a 
block is fully contained in at most one marked node. 

For each marked node $v$ we store a list formed by the blocks fully contained in $v$, and the marked nodes that are children of $v$, in left-to-right order of $P$. The blocks store the number of children of $v$ that start within them, and the children marked nodes store simply a 1 (indicating they contain 1 child of $v$). All also store their position inside the list. The length of all the sequences adds up to $O(n/w^c)$ because each block and marked node appears in at most one list. Their total sum of children is at most $n$, for the same reason. Thus, it is easy to store all the numbers of children as gaps between consecutive 1s in a bitmap, which can be stored within the same space bounds of the other bitmaps in this section ($O(n)$ bits, $O(n/w^c)$ 1s).

Using this bitmap, child and child_rank can easily be solved using rank and select. For $\mathit{child}(v, q)$ on a marked node $v$ we start using $p = \mathit{rank}_1(C_v, \mathit{select}_0(C_v, q))$ on the bitmap $C_v$ of $v$. This tells the position in the list of blocks and marked nodes of $v$ where the $q$-th child of $v$ lies. If it is a marked node, then that node is the child. If instead it is a block $v'$, then the answer corresponds to the $q'$-th minimum within that block, where $q' = q - \mathit{rank}_0(C_v, \mathit{select}_1(C_v, p))$. (Recall that we first have to see if $\mathit{child}(v, q)$ lies in the block of $v$ or in that of $\mathit{findclose}(v)$, using a within-block query in those cases, and otherwise subtracting from $q$ the children that start in the block of $v$.)

For $\mathit{child\_rank}(u)$, we can directly store the answers for marked nodes $u$. Else, it might be that $v = \mathit{parent}(u)$ starts in the same block of $u$ or that $\mathit{findclose}(v)$ is in the same block of $\mathit{findclose}(u)$, in which case we solve $\mathit{child\_rank}(u)$ with an in-block query and the help of $\mathit{degree}(v)$. Otherwise, the block where $u$ belongs must be in the list of $v$, say at position $p_u$. Then the answer is $\mathit{rank}_0(C_v, \mathit{select}_1(C_v, p_u))$ plus the number of minima in the block of $u$ until $u - 1$.

Finally, the remaining operations require just rank and select on $P$, or the virtual bit vectors $P_1$ and $P_2$. For rank it is enough to store the answers at the end of blocks, and finish the query within a single block. For $\mathit{select}_1(P, i)$ (and similarly for $\mathit{select}_0$ and for $P_1$ and $P_2$), we make up a sequence with the accumulated number of 1s in each of the $\tau$ blocks. The numbers add up to $O(n)$ and thus can be represented as gaps of 0s between consecutive 1s in a bitmap $S[0, O(n)]$, which can be stored within the previous space bounds. Computing $x = \mathit{rank}_1(S, \mathit{select}_0(S, i))$, in time $O(t)$, lets us know we must finish the query in block $x$, using its range min-max tree with the local value $i' = \mathit{select}_0(S, i) - \mathit{select}_1(S, x)$.
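The use of bitmap $S$ for select can be sketched in Python as below. This is only an illustration under assumptions: the function name `select1_via_blocks` is hypothetical, $S$ is an ordinary list rather than a compressed bitmap, rank and select on $S$ are linear scans, blocks are numbered from 0, and $\mathit{select}_1(S, 0)$ is taken to be $-1$.

```python
def select1_via_blocks(bits, block_len, i):
    """Position of the i-th 1 (i >= 1) in bits, locating its block through S."""
    blocks = [bits[a:a + block_len] for a in range(0, len(bits), block_len)]
    S = []
    for b in blocks:
        S += [0] * sum(b) + [1]            # one 0 per set bit, then a 1 closing the block
    zeros = [t for t, bit in enumerate(S) if bit == 0]
    ones = [t for t, bit in enumerate(S) if bit == 1]
    p = zeros[i - 1]                       # select_0(S, i)
    x = sum(S[:p + 1])                     # rank_1(S, p): number of blocks fully before
    prev = ones[x - 1] if x > 0 else -1    # select_1(S, x), with select_1(S, 0) = -1
    local = p - prev                       # i' = select_0(S, i) - select_1(S, x)
    inside = [t for t, bit in enumerate(blocks[x]) if bit == 1][local - 1]
    return x * block_len + inside
```

The final in-block step (`inside`) stands in for the query on the range min-max tree of block $x$.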

5.3 The final result 

Recall from Theorem 4 that we actually use blocks of size $B^c$, not $w^c$, for $B = O(\frac{w}{c \log w})$. The sum of the space for all the blocks is $2n + O(n/B^c)$, plus shared universal tables that add up to $O(\sqrt{2^w})$ bits. Padding the last block to size exactly $B^c$ adds up another negligible extra space.

On the other hand, in this section we have extended the results to larger trees of $n$ nodes, adding time $O(t)$ to the operations. By properly adjusting $w$ to $B$ in these results, the overall extra space added is $O(\frac{n(c \log B + \log^2 n)}{B^c} + \frac{n t^t}{\log^t n} + \sqrt{2^w} + n^{3/4})$ bits. Using a computer word of $w = \log n$ bits, setting $t = c$, and expanding $B = O(\frac{\log n}{c \log\log n})$, we get that the time for any operation is $O(c)$ and the total space simplifies to $2n + O(\frac{n (c \log\log n)^c}{\log^{c-2} n})$.

Construction time is $O(n)$. We now analyze the working space for constructing the data structure. We first convert the input balanced parentheses sequence $P$ into a set of aB-trees, each of which represents a part of the input of length $B^c$. The working space is $O(B^c)$ from Theorem 4. Next we compute marked nodes: We scan $P$ from left to right, and if $P[i]$ is an opening parenthesis, we push $i$ in a stack, and if it is closing, we pop an entry from the stack. At this point it is very easy to spot marked nodes. Because $P$ is nested, the values in the stack are monotone. Therefore we can store a new value as the difference from the previous one using unary code. Thus the values in the stack can be stored in $O(n)$ bits. Encoding and decoding the stack values takes $O(n)$ time in total. Once the marked nodes are identified, Pătraşcu's compressed representation [42] of bit vector $C$ is built in $O(n)$ space too, as it also cuts the bitmap into polylog-sized aB-trees and then computes some directories over just $O(n/\mathrm{polylog}(n))$ values.

The remaining data structures, such as the lrm sequences and tree, the lists of the marked nodes, and the bitmaps, are all built on $O(n/B^c)$ elements, thus they need at most $O(n)$ bits of space for construction.

By rewriting $c - 2 - \delta$ as $c$, for any constant $\delta > 0$, we get our main result on static ordinal trees, Theorem 1.

6 A simple data structure for dynamic trees 

In this section we give a simple data structure for dynamic ordinal trees. In addition to the previous 
query operations, we add now insertion and deletion of internal nodes and leaves. 
6.1 Memory management

We store a 0,1 vector P[0, 2n − 1] using a dynamic min-max tree. Each leaf of the min-max tree
stores a segment of P in verbatim form. The length ℓ of each segment is restricted to L ≤ ℓ ≤ 2L
for some parameter L > 0.

If insertions or deletions occur, the length of a segment will change. We use a standard technique
for dynamic maintenance of memory cells [31]. We regard the memory as an array of cells of length
2L each, hence allocation is easily handled in constant time. We use L + 1 linked lists s_L, ..., s_{2L}
where s_i stores all the segments of length i. All the segments with equal length i are packed
consecutively, without wasting any extra space in the cells of linked list s_i (except possibly at the
head cell of each list). Therefore a cell (of length 2L) stores (parts of) at most three segments,
and a segment spans at most two cells. Each tree leaf stores a pointer to the cell and offset where
its segment is stored. If the length of a segment changes from i to j, it is moved from s_i to s_j. The
space freed by the removal is filled with the head segment in s_i, and the removed segment is
stored at the head of s_j.
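The move-and-fill discipline can be sketched in a few lines of Python. This is a simplified model, not the cell-level layout: each list s_i is a Python list whose last element plays the role of the head, segments are strings of bits, and the class name SegmentStore and its methods are hypothetical.

```python
class SegmentStore:
    """Segments grouped by length; every group stays packed (no holes)."""

    def __init__(self, L):
        self.lists = {length: [] for length in range(L, 2 * L + 1)}
        self.where = {}  # leaf id -> (length, slot): the pointer kept by the tree leaf

    def add(self, leaf, bits):
        group = self.lists[len(bits)]
        group.append((leaf, bits))                 # store at the head of s_j
        self.where[leaf] = (len(bits), len(group) - 1)

    def remove(self, leaf):
        length, slot = self.where.pop(leaf)
        group = self.lists[length]
        bits = group[slot][1]
        head_leaf, head_bits = group.pop()         # head segment of s_i
        if head_leaf != leaf:
            group[slot] = (head_leaf, head_bits)   # fill the hole with the head
            self.where[head_leaf] = (length, slot) # back-pointer: fix the moved leaf
        return bits

    def resize(self, leaf, new_bits):
        """A segment whose length changed migrates from s_i to s_j."""
        self.remove(leaf)
        self.add(leaf, new_bits)

    def get(self, leaf):
        length, slot = self.where[leaf]
        return self.lists[length][slot][1]
```

The pair stored with each segment corresponds to the back-pointer from a segment to its leaf, which is what lets the head segment be relocated without searching the tree.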

With this scheme, scanning any segment takes O(L / log n) time, by processing it by chunks of
Θ(log n) bits. This is also the time to compute operations fwd_search, bwd_search, rmqi, etc. on the
segment, using universal tables. Migrating a segment to another list is also done in O(L / log n) time.

If a migration of a segment occurs, pointers to the segment from a leaf of the tree must change.
For this sake we store back-pointers from each segment to its leaf. Each cell also stores a pointer
to the next cell of its list. Finally, an array of pointers for the heads of s_L, ..., s_{2L} is necessary.
Overall, the space for storing a 0,1 vector of length 2n is 2n + O(n log n / L) bits.

The rest of the dynamic tree will use sublinear space, and thus we allocate fixed-size memory 
cells for the internal nodes, as they will waste at most a constant fraction of the allocated space. 

6.2 A dynamic tree 

We give a simple dynamic data structure representing an ordinal tree with n nodes using 2n +
O(n / log n) bits, and supporting all query and update operations in O(log n) worst-case time.

We divide the 0,1 vector P[0, 2n − 1] into segments of length from L to 2L, for L = log² n. We use
a balanced binary tree for representing the range min-max tree. If a node of the tree corresponds
to a vector P[i, j], the node stores i and j, as well as e = sum(P, π, i, j), m = rmq(P, π, i, j),
M = RMQ(P, π, i, j), and n, the number of minimum values in P[i, j] regarding π. (Data on φ for
the virtual vectors P_1 and P_2 is handled analogously.)

It is clear that fwd_search, bwd_search, rmqi, RMQi, rank, select, degree, child and child_rank
can be computed in O(log n) time, by using the same algorithms developed for small trees in
Section 4. These operations cover all the functionality of Table 1. Note the values we store are
local to the subtree (so that they are easy to update), but global values are easily derived in a
top-down traversal. For example, to solve fwd_search(P, π, i, d) starting at the min-max tree root v
with children v_l and v_r, we first see if j(v_l) ≥ i, in which case we try first on v_l. If the answer is
not there or j(v_l) < i, we try on v_r, changing d to d − e(v_l). This will only traverse O(log n) nodes,
as seen in Section 4. As another example, to compute depth(i) from v we first see if j(v_l) ≥ i, in
which case we continue at v_l, otherwise we continue at v_r and add e(v_l) to that result.
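The per-node fields and the two top-down traversals just described can be sketched as follows. This is a static, illustrative Python version under stated assumptions: 1 encodes an opening parenthesis and 0 a closing one, leaves are scanned bit by bit instead of through universal tables, the target d passed to fwd_search is already an excess measured from the start of the node's range, and the names Node, make_leaf, combine and build are hypothetical.

```python
class Node:
    def __init__(self, i, j, e, m, M, n, left=None, right=None):
        self.i, self.j = i, j                      # the node covers P[i..j]
        self.e, self.m, self.M, self.n = e, m, M, n  # all local to the range
        self.left, self.right = left, right

def make_leaf(P, i, j):
    e, m, M, n = 0, None, None, 0
    for t in range(i, j + 1):
        e += 1 if P[t] else -1
        if m is None or e < m:
            m, n = e, 1
        elif e == m:
            n += 1
        if M is None or e > M:
            M = e
    return Node(i, j, e, m, M, n)

def combine(vl, vr):
    """Recompute a parent from its children; values stay local to the parent."""
    right_min = vl.e + vr.m
    if vl.m < right_min:
        n = vl.n
    elif vl.m > right_min:
        n = vr.n
    else:
        n = vl.n + vr.n
    return Node(vl.i, vr.j, vl.e + vr.e, min(vl.m, right_min),
                max(vl.M, vl.e + vr.M), n, vl, vr)

def build(P, i, j, seg_len):
    if j - i + 1 <= seg_len:
        return make_leaf(P, i, j)
    mid = (i + j) // 2
    return combine(build(P, i, mid, seg_len), build(P, mid + 1, j, seg_len))

def depth(v, P, i):
    """Excess of the prefix P[0..i], derived top-down from local values."""
    if v.left is None:
        return sum(1 if P[t] else -1 for t in range(v.i, i + 1))
    if v.left.j >= i:
        return depth(v.left, P, i)
    return v.left.e + depth(v.right, P, i)

def fwd_search(v, P, i, d):
    """Smallest t >= i in v's range whose excess, counted from v.i, equals d."""
    if v.j < i:
        return None
    if v.i >= i and not (v.m <= d <= v.M):   # fully covered node: prune by [m, M]
        return None
    if v.left is None:
        e = 0
        for t in range(v.i, v.j + 1):
            e += 1 if P[t] else -1
            if t >= i and e == d:
                return t
        return None
    ans = fwd_search(v.left, P, i, d)
    if ans is None:
        ans = fwd_search(v.right, P, i, d - v.left.e)
    return ans
```

With these pieces, the closing parenthesis matching an opening one at position i could be found as fwd_search(root, P, i, depth(root, P, i) − 1), since at the root the local excess coincides with the global one.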

Because each node uses O(log n) bits, and the number of nodes is O(n/L), the total space
is 2n + O(n / log n) bits. This includes the extra O(n log n / L) term for the leaf data. Note that
we need to maintain several universal tables that handle chunks of ½ log n bits. These require
O(√n · polylog(n)) extra bits, which is negligible.

If insertion/deletion occurs, we update a segment, and the stored values in the leaf for the
segment. From the leaf we step back to the root, updating the values of each node v with children
v_l and v_r as follows:

    (i(v), j(v)) = (i(v_l), j(v_r))
    e(v) = e(v_l) + e(v_r)
    m(v) = min(m(v_l), e(v_l) + m(v_r))
    M(v) = max(M(v_l), e(v_l) + M(v_r))
    n(v) = n(v_l)            if m(v_l) < e(v_l) + m(v_r),
           n(v_r)            if m(v_l) > e(v_l) + m(v_r),
           n(v_l) + n(v_r)   otherwise.

If the length of the segment exceeds 2L, we split it into two and add a new node. If, instead,
the length becomes shorter than L, we find the adjacent segment to the right. If its length is L,
we concatenate them; otherwise we move the leftmost bit of the right segment to the left one. In this
manner we can keep the invariant that all segments have length L to 2L. Then we update all the
values in the ancestors of the modified leaves, as explained. If a balancing operation occurs, we also
update the values in the nodes involved. All these updates are carried out in constant time per
involved node, as their values are recomputed using the formulas above. Thus the update time is
also O(log n).

When ⌈log n⌉ changes, we must update the allowed values for L, recompute universal tables,
change the width of the stored values, etc. Mäkinen and Navarro [29] have shown how to do this for
a very similar case (dynamic rank/select on a bitmap). Their solution of splitting the bitmap into
three parts and moving border bits across parts to deamortize the work applies verbatim to our
sequence of parentheses, thus we can handle changes in ⌈log n⌉ without altering the space nor the
time complexity (except for O(w) extra bits in the space due to a constant number of system-wide
pointers, a technicality we ignore). We have one range min-max tree for each of the three parts and
adapt all the algorithms in the obvious manner.⁶

⁶ One can act as if one had a single range min-max tree where the first two levels were used to split the three parts
(these first nodes would be special in the sense that their handling of insertions/deletions would reflect the actions
on moving bits between the three parts).

7 A faster dynamic data structure

Instead of the balanced binary tree, we use a B-tree with branching factor Θ(√log n), as in previous
work [8]. Then the depth of the tree is O(log n / log log n). The length of the segments is L to 2L
for L = log² n / log log n. The required space for the range min-max tree and the vector is now
2n + O(n log log n / log n) bits (the internal nodes use O(log^{3/2} n) bits but there are only
O(n / (L √log n)) of them). Now each leaf can be processed in time O(log n / log log n).

Each internal node v of the range min-max tree has k children, for √log n ≤ k ≤ 2√log n
(we relax the constants later). Let c_1, c_2, ..., c_k be the children of v, and [ℓ_1, r_1], ..., [ℓ_k, r_k] be
their corresponding subranges. We store (i) the children boundaries ℓ_i, (ii) s^φ[1, k] and s^ψ[1, k]
storing s^{φ/ψ}[i] = sum(P, φ/ψ, ℓ_1, r_i), (iii) e[1, k] storing e[i] = sum(P, π, ℓ_1, r_i), (iv) m[1, k] storing
m[i] = e[i − 1] + rmq(P, π, ℓ_i, r_i), M[1, k] storing M[i] = e[i − 1] + RMQ(P, π, ℓ_i, r_i), and (v) n[1, k]
storing in n[i] the number of times the minimum excess within the i-th child occurs within its
subtree. Note that the values stored are local to the subtree (as in the simpler balanced binary
tree version, Section 6) but cumulative with respect to previous siblings. Note also that storing
s^φ, s^ψ, and e is redundant, as noted in Section 4.3, but we need s^{φ/ψ} in explicit form to achieve
constant-time searching into their values, as will become clear soon.

Apart from simple accesses to the stored values, we need to support the following operations 
within any node: 

• p(i): the largest j such that ℓ_j ≤ i (or j = 1).

• w^{φ/ψ}(i): the largest j such that s^{φ/ψ}[j − 1] < i (or j = 1).

• f(i, d): the smallest j ≥ i such that m[j] ≤ d ≤ M[j].

• b(i, d): the largest j ≤ i such that m[j] ≤ d ≤ M[j].

• r(i, j): the smallest x such that m[x] is minimum in m[i, j].

• R(i, j): the smallest x such that M[x] is maximum in M[i, j].

• n(i, j): number of times the minimum within the subtrees of children i to j occurs within
that range. 

• r{i,j, t): the x such that the t-th minimum within the subtrees of children i to j occurs within 
the x-th child. 

• update: updates the data structure upon ±1 changes in some child. 
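As a reading aid, the node operations listed above can be written as naive reference implementations. The sketch below assumes arrays indexed from 1 (slot 0 is unused padding), takes p(i) to be the child whose subrange contains position i, and uses linear scans throughout, so it shows what each operation returns, not how the constant-time or O(log k)-time versions developed later work. The function names n_range and r_t are hypothetical labels for n(i, j) and r(i, j, t).

```python
def p(ell, i):
    """Child containing position i: largest j with ell[j] <= i (or 1)."""
    j = 1
    for c in range(1, len(ell)):
        if ell[c] <= i:
            j = c
    return j

def f(m, M, i, d):
    """Smallest j >= i with m[j] <= d <= M[j], or None."""
    for j in range(i, len(m)):
        if m[j] <= d <= M[j]:
            return j
    return None

def b(m, M, i, d):
    """Largest j <= i with m[j] <= d <= M[j], or None."""
    for j in range(i, 0, -1):
        if m[j] <= d <= M[j]:
            return j
    return None

def r(m, i, j):
    """Leftmost position of the minimum of m[i..j]."""
    return min(range(i, j + 1), key=lambda x: (m[x], x))

def R(M, i, j):
    """Leftmost position of the maximum of M[i..j]."""
    return min(range(i, j + 1), key=lambda x: (-M[x], x))

def n_range(m, n, i, j):
    """Occurrences of the minimum of m[i..j] within children i..j."""
    lo = min(m[i:j + 1])
    return sum(n[x] for x in range(i, j + 1) if m[x] == lo)

def r_t(m, n, i, j, t):
    """Child holding the t-th occurrence of the minimum of m[i..j], or None."""
    lo = min(m[i:j + 1])
    for x in range(i, j + 1):
        if m[x] == lo:
            if t <= n[x]:
                return x
            t -= n[x]
    return None
```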

Simple operations involving rank and select on P are carried out easily with O(log n / log log n)
applications of p(i) and w^{φ/ψ}(i). For example depth(i) is computed, starting from the root node,
by finding the child j = p(i) to descend, then recursively computing depth(i − ℓ_j) on the j-th child,
and finally adding e[j − 1] to the result. Handling for P_1 and P_2 is immediate; we omit it.

Operations fwd_search/bwd_search can be carried out via O(log n / log log n) applications of
f(i, d)/b(i, d). Recalling Lemma 4, the interval of interest is partitioned into
O(√log n · log n / log log n) nodes of the B-tree, but these can be grouped into O(log n / log log n)
sequences of consecutive siblings. Within each such sequence a single f(i, d)/b(i, d) operation is
sufficient. For example, for fwd_search(i, d), let us assume d is a global excess to find (i.e., start
with d ← d + depth(i) − 1). We start at the root v of the range min-max tree, and compute j = p(i),
so the search starts at the j-th child, with the recursive query fwd_search(i − ℓ_j, d − e[j − 1]). If the
answer is not found in that child, query j' = f(j + 1, d) tells that it is within child j'. We then enter
recursively into the j'-th child of the node with fwd_search(i − ℓ_{j'}, d − e[j' − 1]), where the answer is
sure to be found.

Operations rmqi and RMQi are solved in very similar fashion, using O(log n / log log n)
applications of r(i, j)/R(i, j). For example, to compute rmq(i, i') (the extension to rmqi is obvious) we
start with j = p(i) and j' = p(i'). If j = j' we answer with e[j − 1] + rmq(i − ℓ_j, i' − ℓ_j) on the j-th
child of the current node. Otherwise we recursively compute e[j − 1] + rmq(i − ℓ_j, ℓ_{j+1} − ℓ_j − 1),
e[j' − 1] + rmq(0, i' − ℓ_{j'}) and, if j + 1 < j', m[r(j + 1, j' − 1)], and return the minimum of the two
or three values.

For degree we partition the interval as for rmqi and then use m[r(i, j)] in each node to identify
those holding the global minimum. For each node holding the minimum, n(i, j) gives the number
of occurrences of the minimum in the node. Thus we apply r(i, j) and n(i, j) O(log n / log log n)
times. Operation child_rank is very similar, by changing the right end of the interval of interest, as
before. Finally, solving child is also similar, except that when we exceed the desired rank in the sum
(i.e., in some node n(i, j) ≥ t, where t is the local rank of the child we are looking for), we find the
desired min-max tree branch with r(i, j, t), and continue on the child with t ← t − n(i, r(i, j, t) − 1),
using one r(i, j, t) operation per level.



7.1 Dynamic partial sums 

Let us now face the problem of implementing the basic operations. Our first tool is a result by 
Raman et al., which solves several subproblems of the same type. 

Lemma 14 ([43]) Under the RAM model with word size Θ(log n), it is possible to maintain a
sequence of log^ε n nonnegative integers x_1, x_2, ... of log n bits each, for any constant 0 < ε < 1,
such that the data structure requires O(log^{1+ε} n) bits and carries out the following operations in
constant time: sum(i) = x_1 + ... + x_i, search(s) = max{i, sum(i) ≤ s}, and update(i, δ), which sets
x_i ← x_i + δ, for −log n ≤ δ ≤ log n. The data structure also uses a precomputed universal table of
size O(n^{ε'}) bits for any fixed ε' > 0. The structure can be built in O(log^ε n) time except the table.
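The interface of Lemma 14 can be pinned down with a naive reference version. The Python sketch below only fixes the semantics of sum, search and update; it runs in linear time and makes no attempt at the constant-time, table-based solution of [43]. The class name PartialSums is hypothetical.

```python
class PartialSums:
    """Naive model of the sum/search/update interface (1-based indices)."""

    def __init__(self, xs):
        self.x = list(xs)

    def sum(self, i):
        """x_1 + ... + x_i."""
        return sum(self.x[:i])

    def search(self, s):
        """max{ i : sum(i) <= s }; returns 0 if even x_1 exceeds s."""
        total, i = 0, 0
        for value in self.x:
            if total + value > s:
                break
            total += value
            i += 1
        return i

    def update(self, i, delta):
        """x_i <- x_i + delta, keeping the sequence nonnegative."""
        if self.x[i - 1] + delta < 0:
            raise ValueError("values must stay nonnegative")
        self.x[i - 1] += delta
```

Storing child sizes as the x_j makes sum return child boundaries and search locate the child containing a position, which is how this structure is put to use for the node arrays.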

Then we can store ℓ, s^φ, and s^ψ in differential form, and obtain their values via sum. The
same can be done with e, provided we fix the fact that it can contain negative values by storing
e[i] + 2^{⌈log n⌉} · i (this works for constant-time sum, yet not for search). Operations p and w^{φ/ψ} are
then solved via search on ℓ and s^{φ/ψ}, respectively. Moreover we can handle ±1 changes in the subtrees
in constant time as well. In addition, we can store m[i] − e[i − 1] and M[i] − e[i − 1], which depend
only on the subtree, and reconstruct the values in constant time using sum on e, which eliminates
the problem of propagating changes in e[i] to m[i + 1, k] and M[i + 1, k]. Local changes to m[i] or
M[i] can be applied directly.



7.2 Cartesian trees 

Our second tool is the Cartesian tree [5, 48]. A Cartesian tree for an array B[1, k] is a binary
tree in which the root node stores the minimum value B[μ], and the left and the right subtrees are
Cartesian trees for B[1, μ − 1] and B[μ + 1, k], respectively. If there exists more than one minimum
value position, then μ is the leftmost. Thus the tree shape has enough information to determine the
position of the leftmost minimum in any range [i, j]. As it is a binary tree of k nodes, a Cartesian
tree can be represented within 2k bits using parentheses and the bijection with general trees. It
can be built in O(k) time.
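A short Python sketch may help to see why the shape alone determines leftmost minima. It builds the tree with the usual stack-based linear-time scan and answers a range query by walking down from the root, using no array values at query time. Indices are 0-based here, pointers replace the 2k-bit parentheses encoding, and the function names are hypothetical.

```python
def build_cartesian_tree(B):
    """Return (root, left, right) child arrays; ties keep the leftmost minimum on top."""
    k = len(B)
    left, right = [-1] * k, [-1] * k
    stack = []                       # rightmost path of the tree built so far
    for i in range(k):
        last = -1
        while stack and B[stack[-1]] > B[i]:   # strict: equal values stay above
            last = stack.pop()
        left[i] = last               # everything popped hangs to the left of i
        if stack:
            right[stack[-1]] = i
        stack.append(i)
    return (stack[0] if stack else -1), left, right

def leftmost_min(root, left, right, i, j):
    """Position of the leftmost minimum of B[i..j], using only the tree shape."""
    x = root
    while not (i <= x <= j):
        x = right[x] if x < i else left[x]
    return x
```

The descent works because an in-order traversal of the tree visits positions in increasing order, so the first node on the way down that falls inside [i, j] is the lowest common ancestor of i and j.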

We build Cartesian trees for m[1, k] and for M[1, k] (this one taking maxima). Since 2k =
O(√log n), universal tables let us answer in constant time any query of the form r(i, j) and R(i, j),
as these depend only on the tree shape as explained. All the universal tables we will use on Cartesian
trees take O(2^{O(√log n)} · polylog(n)) = o(n^α) bits for any constant 0 < α < 1.

We also use Cartesian trees to solve operations f(i, d) and b(i, d). However, these do not
depend only on the tree shape, but on the actual values m[1, k]. We focus on f(i, d) since b(i, d) is
symmetric. Following Lemma 5 we first check whether m[i] ≤ d ≤ M[i], in which case the answer
is i. Otherwise, the answer is either the next j such that m[j] ≤ d (if d < m[i]), or M[j] ≥ d (if
d > M[i]). Let us focus on the case d < m[i], as the other is symmetric. By Lemma 11, the answer
belongs to lrm(i), where the sequence is m[1, k].






Lemma 15 Let C be the Cartesian tree for m[1, k]. Then lrm(i) is the sequence of nodes of C in
the upward path from i to the root, which are reached from the left child.

Proof. The left and right children of node i contain values not smaller than i. All the nodes in
the upward path are equal to or smaller than i. Those reached from the right must be at the left
of position i, as they must be either to the left or to the right of all the nodes already seen, and i
has been seen. Their left children are also to the left of i. Ancestors j reached from the left are
strictly smaller than i and, by the previous argument, to the right of i, thus they belong to lrm(i).
Finally, the right descendants of those j are not in lrm(i) because they are after j and equal to or
larger than m[j]. □
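Lemma 15 can be checked experimentally with a small Python sketch. It assumes that lrm(i) lists the positions after i at which a new strict minimum appears when scanning to the right (whether i itself is counted is a convention not fixed here), that ties place the leftmost minimum higher in the tree, and that indices are 0-based. The function names are hypothetical.

```python
def cartesian_parents(m):
    """Parent array of the Cartesian tree of m (-1 for the root)."""
    parent = [-1] * len(m)
    stack = []
    for i in range(len(m)):
        last = -1
        while stack and m[stack[-1]] > m[i]:
            last = stack.pop()
        if last != -1:
            parent[last] = i         # last popped node becomes the left child of i
        if stack:
            parent[i] = stack[-1]    # i becomes the right child of the new top
        stack.append(i)
    return parent

def lrm_from_tree(parent, i):
    """Ancestors of i that are reached from their left child."""
    out, child = [], i
    while parent[child] != -1:
        anc = parent[child]
        if anc > child:              # child lies in the left subtree of anc
            out.append(anc)
        child = anc
    return out

def lrm_naive(m, i):
    """Successive strict minima found scanning right from position i."""
    out, cur = [], i
    for j in range(i + 1, len(m)):
        if m[j] < m[cur]:
            out.append(j)
            cur = j
    return out
```

Since a node is the left child of its parent exactly when the parent has a larger index, the test anc > child is all that is needed to recognize the relevant ancestors.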

The Cartesian tree can have precomputed lrm(i) for each i, as this depends only on the tree
shape, and thus these sequences are stored in universal tables. This is the sequence of positions in
m[1, k] that must be considered. We can then binary search this sequence, using the technique
described to retrieve any desired m[j], to compute f(i, d) in O(log k) = O(log log n) time.

7.3 Complete trees 

We arrange a complete binary tree on top of the n[1, k] values, so that each node of the tree records
(i) one leaf where the subtree minimum is attained, and (ii) the number of times the minimum
arises in its subtree. This tree is arranged in heap order and requires O(log^{3/2} n) bits of space.

A query n(i, j) is answered essentially as in Section 6. We find the O(log k) nodes that cover
[i, j], find the minimum m[·] value among the leaves stored in (i) for each covering node (recall
we have constant-time access to m), and add up the number of times (field (ii)) the minimum of
m[i, j] occurs. This takes overall O(log k) time.

A query r(i,j,t) is answered similarly, stopping at the node where the left-to-right sum of the 
fields (ii) reaches t, and then going down to the leaf x where t is reached. Then the t-th occurrence 
of the minimum in subtrees i to j occurs within the x-th subtree. 
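The two queries on the complete tree can be sketched as follows. One simplification is made and should be kept in mind: each tree node stores the minimum value itself rather than a leaf where it is attained, which is easier to read but would not tolerate the implicit shifts of m caused by changes in e. Indices are 0-based, the tree is padded to a power of two, and the class name MinCountTree is hypothetical.

```python
class MinCountTree:
    """Heap-ordered complete tree over (m[x], n[x]) pairs."""

    def __init__(self, m, n):
        self.size = 1
        while self.size < len(m):
            self.size *= 2
        self.val = [float("inf")] * (2 * self.size)  # subtree minimum
        self.cnt = [0] * (2 * self.size)             # occurrences of that minimum
        for x in range(len(m)):
            self.val[self.size + x], self.cnt[self.size + x] = m[x], n[x]
        for v in range(self.size - 1, 0, -1):
            self._pull(v)

    def _pull(self, v):
        a, b = 2 * v, 2 * v + 1
        if self.val[a] < self.val[b]:
            self.val[v], self.cnt[v] = self.val[a], self.cnt[a]
        elif self.val[a] > self.val[b]:
            self.val[v], self.cnt[v] = self.val[b], self.cnt[b]
        else:
            self.val[v], self.cnt[v] = self.val[a], self.cnt[a] + self.cnt[b]

    def update(self, x, m_x, n_x):
        v = self.size + x
        self.val[v], self.cnt[v] = m_x, n_x
        v //= 2
        while v >= 1:                # recompute the upward path
            self._pull(v)
            v //= 2

    def _cover(self, i, j):
        """O(log k) nodes covering leaves i..j, in left-to-right order."""
        lo, hi = i + self.size, j + self.size + 1
        left, right = [], []
        while lo < hi:
            if lo & 1:
                left.append(lo)
                lo += 1
            if hi & 1:
                hi -= 1
                right.append(hi)
            lo //= 2
            hi //= 2
        return left + right[::-1]

    def n(self, i, j):
        nodes = self._cover(i, j)
        lo = min(self.val[v] for v in nodes)
        return sum(self.cnt[v] for v in nodes if self.val[v] == lo)

    def r(self, i, j, t):
        """Leaf holding the t-th occurrence of the minimum of m[i..j], or None."""
        nodes = self._cover(i, j)
        lo = min(self.val[v] for v in nodes)
        for v in nodes:
            if self.val[v] != lo:
                continue
            if t > self.cnt[v]:
                t -= self.cnt[v]
                continue
            while v < self.size:     # go down to the leaf where t is reached
                a = 2 * v
                if self.val[a] == lo and t <= self.cnt[a]:
                    v = a
                else:
                    if self.val[a] == lo:
                        t -= self.cnt[a]
                    v = a + 1
            return v - self.size
        return None
```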

When an m[i] or n[i] value changes, we must update the upward path towards the root of the
complete tree, using the update formula for n(v) given in Section 6. This is also sufficient when e[i]
changes: Although this implicitly changes all the m[i + 1, k] values, the local subtree data outside
the ancestors of i are unaffected. Then the root n(v) value will become an n[i'] value at the parent
of the current range min-max tree node (just as the minimum of m[1, k], maximum of M[1, k],
excess e[k], etc., which can be computed in constant time as we have seen).

Since these operations take O(log k) = O(log log n) time, the time complexity of degree,
child, and child_rank is O(log n). Update operations (insert and delete) also require O(log n) time,
as we may need to update n[·] for one node per tree level. However, as we see later, it is possible
to achieve time complexity O(log n / log log n) for insert and delete for all the other operations.
Therefore, we might choose not to support operations n(i, j) and r(i, j, t) to retain the lower update
complexity. In this case, operations degree, child, and child_rank can only be implemented naively
using first_child, next_sibling, and parent.

7.4 Updating Cartesian trees 

We already solved some simple cases of update, but not yet how to maintain Cartesian trees. When
a value m[i] or M[i] changes (by ±1), the Cartesian trees might change their shape. Similarly, a
±1 change in e[i] induces a change in the effective value of m[i + 1, k] and M[i + 1, k]. We store m
and M in a way independent of e, but the Cartesian trees are built upon the actual values of m
and M. Let us focus on m, as M is similar. If m[i] decreases by 1, we need to determine if i should
go higher in the tree. We compare i with its Cartesian tree parent j = Cparent(i) and, if (a) i < j
and m[i] − m[j] = 0, or if (b) i > j and m[i] − m[j] = −1, we must carry out a rotation with i
and j. Figure 4 shows the two cases. As can be noticed, case (b) may propagate the rotations
towards the new parent of i, as it generates a new distance d − 1 that is smaller than before.

Figure 4: Rotations between i and its parent j when m[i] decreases by 1. The edges between any x
and its parent are labeled with d[x] = m[x] − m[Cparent(x)], if these change during the rotation.
The d[·] values have already been updated. On the left, when i < j; on the right, when i > j.

In order to carry out those propagations in constant time, we store an array d[1, k], so that
d[i] = m[i] − m[Cparent(i)] if this is < k + 2, and k + 2 otherwise. Since d[1, k] requires O(k log k) =
O(√log n · log log n) = o(log n) bits of space, it can be manipulated in constant time using universal
tables: With d[1, k] and the current Cartesian tree as input, a universal table can precompute the
outcome of the changes in d[·] and the corresponding sequence of rotations triggered by the decrease
of m[i] for any i, so we can obtain in constant time the new Cartesian tree and the new table d[1, k].
The limitation of values up to k + 2 is necessary for the table fitting in a machine word, and its
consequences will be discussed soon.

Similarly, if m[i] increases by 1, we must compare i with its two children: (a) the difference
with its left child cannot fall below 1 and (b) the difference with its right child cannot fall below
0. Otherwise we must carry out rotations as well, depicted in Figure 5. While it might seem that
case (b) can propagate rotations upwards (due to the d − 1 at the root), this is not the case because
d had just been increased as m[i] grew by 1. In case both (a) and (b) arise simultaneously, we
must apply the rotation corresponding to (b) and then that of (a). No further propagation occurs.
Again, universal tables can precompute all these updates.

For changes in e[i], the universal tables have precomputed the effect of carrying out all the
changes in m[i + 1, k], updating all the necessary d[1, k] values and the Cartesian tree. This is
equivalent to precomputing the effect of a sequence of k − i successive changes in m[·].

Figure 5: Rotations between i and its children when m[i] increases by 1. The edges between any x
and its parent are labeled with d[x] = m[x] − m[Cparent(x)], if these change during the rotation.
The d[·] values have already been updated. On the left (right), when the edge to the left (right)
child becomes invalid after the change in d[·].

Our array d[1, k] distinguishes values between 0 and k + 2. As the changes to the structure of
the Cartesian tree only depend on whether d[i] is 0, 1, or larger than 1, and all the updates to d[i]
are by ±1 per operation, we have sufficient information in d[·] to correctly predict any change in
the Cartesian tree shape for the next k updates. We refresh table d fast enough to ensure that
no value of d[·] is used for more than k updates without recomputing it, as then its imprecision
could cause a flaw. We simply recompute cyclically the cells of d, one per update. That is,
at the i-th update arriving at the node, we recompute the cell i' = 1 + (i mod k), setting again
d[i'] = min(k + 2, m[i'] − m[Cparent(i')]); note that Cparent(i') is computed from the Cartesian
tree shape in constant time via table lookup. Note the values of m[·] are always up to date because
we do not keep them in explicit form but with e[i − 1] subtracted (and in turn e is not maintained
explicitly but via partial sums).

7.5 Handling splits and merges 

In case of splits or merges of segments or internal range min-max tree nodes, we must insert or
delete children in a node. To maintain the range min-max tree dynamically, we use Fleischer's data
structure [16]. This is an (a, 2b)-tree (for a < 2b) storing n numeric keys in the leaves, and each
leaf is a bucket storing at most 2 log_a n keys. It supports constant-time insertion and deletion of a
key once its location in a leaf is known.

Each leaf owns a cursor, which is a pointer to a tree node. This cursor traverses the tree
upwards, looking for nodes that should be split, moving one step per insertion received at the leaf.
When the cursor reaches the root, the leaf has received at most log_a n insertions and thus it is
split. Both new leaves are born with their cursor at their common parent. In addition some edges
must be marked. Marks are considered when splitting nodes (see Fleischer [16] for details). The
insertion steps are as follows:



1. Insert the new key into the leaf B. Let v be the current node where the cursor of B points.

2. If v has more than b children, split it into v_1 and v_2, and unmark all edges leaving from those
nodes. If the parent of v has more than b children, mark the edges to v_1 and v_2.

3. If v is not the root, set the cursor to the parent of v. Otherwise, split B into two halves, and
let the cursor of both new buckets point to their common parent.

To apply this to our data structure, let a = √log n, b = 2√log n. Then the height of the tree is
O(log n / log log n), and each leaf should store O(log n / log log n) keys. Instead our structure stores
O(log² n / log log n) bits in each leaf. If Fleischer's structure handles O(log n)-bit numbers, it turns
out that the leaf size is the same in both cases. The difference is that our insertions are bitwise,
whereas Fleischer's insertions are number-wise (that is, in packets of O(log n) bits). Therefore
we can use the same structure, yet sparing the cursor moves so that one out of log n insertions
triggers a move (indeed one can only split a leaf if it has actually exceeded size 2L). Marking and
unmarking of children edges is easily handled in constant time by storing a bit-vector of length 2b
in each node.



Fleischer's update time is constant. Ours is O(√log n) because, if we split a node into two,
we fully reconstruct all the values in those two nodes and their parent. This can be done in
O(k) = O(√log n) time, as the structure of Lemma 14, the Cartesian trees, and the complete trees
can be built in linear time. Nevertheless this time is dominated by the O(log n / log log n) cost of
inserting a bit at the leaf.

Deletions at leaves are handled so that they always have between L and 2L bits. Deletion of
children of internal nodes may make the node arity fall below a. This is handled as in Fleischer's
structure, by deamortized global rebuilding. This increases only the sublinear size of the range
min-max tree; the leaves are not affected. As a consequence, our tree arities are in the range
1 ≤ k ≤ 4√log n.

7.6 The final result 

We have obtained the following result. 

Lemma 16 For a 0,1 vector of length 2n, there exists a data structure using 2n + O(n log log n / log n)
bits supporting fwd_search and bwd_search in O(log n) time, and updates and all other queries except
degree, child, and child_rank, in O(log n / log log n) time. Alternatively, degree, child, child_rank,
and updates can be handled in O(log n) time.

The complexity of fwd_search and bwd_search is not completely satisfactory, as we have reduced
many operators to those. To achieve better complexities, we note that most operators that reduce
to fwd_search and bwd_search actually reduce to the less general operations findclose, findopen, and
enclose on parentheses. Those three operations can be supported in time O(log n / log log n) by
adapting the technique of Chan et al. [8]. They use a tree of similar layout as ours: leaves storing
O(log² n / log log n) parentheses and internal nodes of arity k = O(√log n), where Lemma 14 is used
to store seven arrays of numbers recording information on matched and unmatched parentheses on
the children. Those are updated in constant time upon parenthesis insertions and deletions, and
are sufficient to support the three operations. They report O(n) bits of space because they do not
use a mechanism like the one we describe in Section 6.1 for the leaves; otherwise their space would
be 2n + O(n log log n / log n) as well. Note, on the other hand, that they do not achieve the times
we offer for lca and related operations.

This completes the main result of this section, Theorem 2.

7.7 Updating whole subtrees 

We now face the problem of attaching and detaching whole subtrees. Now we assume log n is
fixed to some sufficiently large value, for example log n = w, the width of the system-wide pointers.
Hence, no matter the size of the trees, they use segments of the same length, and the times are a
function of w and not of the actual tree size.

Now we cannot use Fleischer's data structure [16], because a detached subtree could have
dangling cursors pointing to the larger tree it belonged to. As a result, the time complexity for insert
or delete changes to O(log^{3/2} n / log log n). To improve it, we change the degree of nodes in the
range min-max tree from Θ(√log n) to Θ(log^ε n) for any fixed constant ε > 0. This makes the
complexity of insert and delete O((1/ε) log^{1+ε} n / log log n) = O(log^{1+ε} n), and multiplies all query
time complexities by the constant O(1/ε).

First we consider attaching a tree T_1 to another tree T_2, that is, T_1 becomes a subtree rooted at
a node v of T_2. Here v can be either an internal node or a leaf. Let P_1[0, 2n_1 − 1] and P_2[0, 2n_2 − 1] be
the BP sequences of T_1 and T_2, respectively. Then this attaching operation corresponds to creating
a new BP sequence P' = P_2[0, p] P_1[0, 2n_1 − 1] P_2[p + 1, 2n_2 − 1] where p and p + 1 are positions of
parentheses for siblings of the root of T_1 in the new tree if v is an internal node, or p and p + 1 are
the positions for v if v is a leaf.

If p and p + 1 belong to the same segment, we cut the segment into two, say P_l = P[ℓ, p] and
P_r = P[p + 1, r]. If the length of P_l (P_r) is less than L, we concatenate it to the left (right) segment
of it. If its length exceeds 2L, we split it into two. We also update the upward paths from P_l and
P_r to the root of the range min-max tree for T_2 to reflect the changes done at the leaves.

Now we merge the range min-max trees for T_1 and T_2 as follows. Let h_1 be the height of the
range min-max tree of T_1, and h_2 be the height of the lca, say v, between P_l and P_r in the range
min-max tree of T_2. If h_2 > h_1 then we can simply concatenate the root of T_1 at the right of the
ancestor of P_l of height h_1, then split the node if it has overflowed, and finish.

If h_2 ≤ h_1, we divide v into v_l and v_r, so that the rightmost child of v_l is an ancestor of P_l and
the leftmost child of v_r is an ancestor of P_r. We do not yet care about v_l or v_r being too small.
We repeat the process on the parent of v until reaching the height h_2 = h_1 + 1. Let us call u
the ancestor where this height is reached (we leave for later the case where we split the root of T_2
without reaching the height h_1 + 1).

Now we add T_1 as a child of u, between the child ancestor of P_l and that ancestor of P_r. All the
leaves have the same depth, but the ancestors of P_l and of P_r at heights h_2 to h_1 might be underfull
as we have cut them arbitrarily. We glue the ancestor of height h of P_l with the leftmost node of
height h of T_1, and that of P_r with the rightmost node of T_1, for all h_2 ≤ h ≤ h_1. Now there are
no underfull nodes, but they can have overflowed. We verify the node sizes in both paths, from
height h = h_2 to h_1 + 1, splitting them as necessary. At height h_2 the node can be split into two,
adding another child to its parent, which can thus be split into three, adding in turn two children
to its parent, but from there on nodes can only be split into three and add two more children to
their parent. Hence the overall process of fixing arities takes time O((1/ε) log^{1+ε} n / log log n).

If node u does not exist, then T_1 is not shorter than T_2. In this case we have divided T_2 into a
left and right part. Let h_2 be the height of T_2. We attach the left part of T_2 to the leftmost node
of height h_2 in T_1, and the right part of T_2 to the rightmost node of height h_2 in T_1. Then we fix
arities in both paths analogously as before.

Detaching is analogous as well. After splitting the leftmost and rightmost leaves of the area to
be detached, let $P_l$ and $P_r$ be the leaves of T preceding and following the leaves that will be detached.
We split the ancestors of $P_l$ and $P_r$ until reaching their lowest common ancestor (lca); let it be v.
Then we can form a new tree with the detached part and remove it from the original tree T. Again,
the paths from $P_l$ and $P_r$ to v may contain underfull nodes. But now $P_l$ and $P_r$ are consecutive
leaves, so we can merge their ancestor paths up to v and then split as necessary.

Similarly, the leftmost and rightmost path of the detached tree may contain underfull nodes. 
We merge each node of the leftmost (rightmost) path with its right (left) sibling, and then split 
if necessary. The root may contain as few as two children. Overall the process takes $O(\log^{1+\epsilon} n)$
time.



8 Improving dynamic compressed sequences 

The techniques we have developed along the paper are of independent interest. We illustrate 
this point by improving the best current results on sequences of numbers with sum and search 
operations, dynamic compressed bitmaps, and their many byproducts. 



8.1 Codes, Numbers, and Partial Sums 

We now prove Lemma 1 on sequences of codes and partial sums, thereby improving previous results
by Mäkinen and Navarro [29] and matching lower bounds [40].

Section 7 shows how to maintain a dynamic bitmap P supporting various operations in time
$O(\log n / \log\log n)$, including insertion and deletion of bits (parentheses in P). This bitmap P will
now be the concatenation of the (possibly variable-length) codes $x_i$. We will ensure that each leaf
contains a sequence of whole codes (no code is split at a leaf boundary). As these are of $O(\log n)$
bits, we only need to slightly adjust the lower limit L to enforce this: After splitting a leaf of length
2L, one of the two new leaves might be of size $L - O(\log n)$.

We process a leaf by chunks of $b = \frac{1}{2} \log n$ bits: A universal table (easily computable in
$O(\sqrt{n}\,\mathrm{polylog}(n))$ time and space) can tell us how many whole codes there are in the next b
bits, how much their $f(\cdot)$ values add up to, and where the last complete code ends (assuming we
start reading at a code boundary). Note that the first code could be longer than b, in which case
the table lets us advance zero positions. In this case we decode the next code directly. Thus in
constant time (at most two table accesses plus one direct decoding) we advance in the traversal
by at least b bits. If we surpass the desired position with the table we reprocess the last $O(\log n)$
codes using a second table that advances by chunks of $O(\sqrt{\log n})$ bits, and finally process the last
$O(\sqrt{\log n})$ codes directly. Thus in time $O(\log n / \log\log n)$ we can access a given code in a leaf
(and subsequent ones in constant time each), sum the $f(\cdot)$ values up to some position, and find the
position where a given sum s is exceeded. We can also easily modify a code or insert/delete codes,
by shifting all the other codes of the leaf in time $O(\log n / \log\log n)$.
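The table-driven traversal of a leaf can be sketched as follows. This is only an illustration under stated assumptions: Elias gamma codes stand in for the (unspecified) self-delimiting codes, $f(\cdot)$ is taken to be the decoded value, bit strings are Python strings, and the names `gamma_encode`, `decode_one`, `build_table`, and `count_and_sum` are hypothetical. The chunk length `b` is a small parameter here rather than $\frac{1}{2} \log n$.

```python
from itertools import product

def gamma_encode(x):
    """Elias gamma code of an integer x >= 1, as a '0'/'1' string (assumed code)."""
    binary = bin(x)[2:]
    return "0" * (len(binary) - 1) + binary

def decode_one(bits, pos):
    """Decode the code starting at pos. Return (value, next_pos), or None if it is cut off."""
    zeros = 0
    while pos + zeros < len(bits) and bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    if end > len(bits):
        return None
    return int(bits[pos + zeros:end], 2), end

def build_table(b):
    """Universal table: for each b-bit chunk, (whole codes, sum of values, bits consumed)."""
    table = {}
    for symbols in product("01", repeat=b):
        chunk = "".join(symbols)
        count = total = pos = 0
        while True:
            step = decode_one(chunk, pos)
            if step is None:
                break
            value, pos = step
            count += 1
            total += value
        table[chunk] = (count, total, pos)
    return table

def count_and_sum(bits, table, b):
    """Traverse a leaf chunk by chunk; return (number of codes, sum of their values)."""
    pos = count = total = 0
    while pos < len(bits):
        whole, add, used = table.get(bits[pos:pos + b], (0, 0, 0))
        if used == 0:
            # The first code is longer than b bits (or we are at a short tail):
            # the table advances zero positions, so decode one code directly.
            value, pos = decode_one(bits, pos)
            whole, add = 1, value
        else:
            pos += used
        count += whole
        total += add
    return count, total
```

Each loop iteration costs one table access plus at most one direct decoding, which mirrors the constant-time step described above; stopping at a given position or at a given sum would follow the same loop.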



In internal nodes of the range min-max tree we will use the structure of Lemma 14 to maintain
the number of codes stored below the subtree of each child of the node. This allows determining
in constant time the child to follow when looking for any code $x_i$, thus access to any codes $x_i \ldots x_j$
is supported in time $O(\log n / \log\log n + j - i)$.

When a code is inserted/deleted at a leaf, we must increment/decrement the number of codes in
the subtree of the ancestors up to the root; this is supported in constant time by Lemma 14. Splits
and merges can be caused by indels and by updates. They force the recomputation of their whole
parent node, and Fleischer's technique is used to ensure a constant number of splits/merges per
update. Note we are inserting not individual bits but whole codes of $O(\log n)$ bits. This can easily
be done, but now $O(\log n / \log\log n)$ insertions/updates can double the size of a leaf, and thus we
must consider splitting the leaf every time the cursor returns to it (as in the original Fleischer's
proposal, not every $\log n$ times as when inserting parentheses), and we must advance the cursor
upon insertions and updates.

For supporting sum and search we also maintain at each node the sum of the $f(\cdot)$ values of
the codes stored in the subtree of each child. Then we can determine in constant time the child
to follow for search, and the sum of previous subtrees for sum. However, insertions, deletions and
updates must alter the upward sums only by $O(\log n)$ so that the change can be supported by
Lemma 14 within the internal nodes in constant time.
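A minimal sketch of how per-child counts and per-child sums drive access, sum, and search is given below. It assumes a single internal node whose children are leaves holding already-decoded, non-negative $f(\cdot)$ values, and it replaces the constant-time structure of Lemma 14 with plain prefix sums and binary search; the class `Node` and its method names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

class Node:
    """One internal node whose children are leaves holding plain lists of f-values."""

    def __init__(self, leaves):
        self.leaves = leaves
        self.counts = [len(leaf) for leaf in leaves]  # codes below each child
        self.sums = [sum(leaf) for leaf in leaves]    # f-values below each child

    @staticmethod
    def _child(per_child, target):
        """First child whose running total exceeds target, and the total to its left."""
        prefix = list(accumulate(per_child))
        k = bisect_right(prefix, target)
        return k, (prefix[k - 1] if k else 0)

    def sum_upto(self, i):
        """f(x_0) + ... + f(x_i): follow the counts, accumulate the sums."""
        k, codes_before = self._child(self.counts, i)
        return sum(self.sums[:k]) + sum(self.leaves[k][:i - codes_before + 1])

    def search(self, s):
        """Smallest i such that f(x_0) + ... + f(x_i) > s: follow the sums."""
        k, acc = self._child(self.sums, s)
        if k == len(self.leaves):
            raise ValueError("s is not exceeded by the total sum")
        for j, value in enumerate(self.leaves[k]):
            acc += value
            if acc > s:
                return sum(self.counts[:k]) + j
```

In the structure described in the text the same two arrays are kept at every internal node, so the descent repeats this child selection once per level.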



8.2 Dynamic bitmaps 

Apart from its general interest, handling a dynamic bitmap in compressed form is useful for
maintaining satellite data for a sample of the tree nodes. A dynamic bitmap B could mark which nodes
are sampled, so if the sampling is sparse enough we would like B to be compressed. A rank on
this bitmap would give the position in a dynamic array where the satellite information for the
sampled nodes would be stored. This bitmap would be accessed by preorder (pre_rank) on the
dynamic tree. That is, node v is sampled iff B[pre_rank(v)] = 1, and if so, its data is at
position rank_1(B, pre_rank(v)) in the dynamic array of satellite data. When a tree node is inserted or
deleted, we need to insert/delete its corresponding bit in B.
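The use of rank to address satellite data can be illustrated with a short sketch. It assumes a plain Python list in place of the compressed bitmap, a linear-scan `rank1`, and a 0-based data array (hence the `- 1`); the function names are hypothetical and preorder numbers are assumed to be supplied by the tree structure.

```python
def rank1(B, i):
    """Number of 1s in B[0..i], inclusive (linear scan stands in for the real structure)."""
    return sum(B[:i + 1])

def satellite(B, data, pre):
    """Data of the node with preorder number pre, or None if the node is not sampled."""
    if not B[pre]:
        return None
    return data[rank1(B, pre) - 1]  # rank is 1-based, the Python list is 0-based

def insert_node(B, data, pre, info=None):
    """Insert a node at preorder position pre; it is sampled iff info is given."""
    B.insert(pre, 1 if info is not None else 0)
    if info is not None:
        data.insert(rank1(B, pre) - 1, info)
```

Inserting an unsampled node only shifts bits in B, so the data array is untouched, which is the reason a sparse sampling keeps the satellite array small.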

In the following we prove the next lemma, which improves and indeed simplifies previous results 
[HI [29]; then we explore several byproducts. 

Lemma 17 We can store any bitmap $B[0, n-1]$ within $nH_0(B) + O(n \log\log n / \log n)$ bits of space,
while supporting the operations rank, select, insert, and delete, all in time $O(\log n / \log\log n)$. We
can also support attachment and detachment of contiguous bitmaps within time $O(\log^{1+\epsilon} n)$ for any
constant $\epsilon > 0$, yet now $\log n$ is a maximum fixed value across all the operations.

To achieve zero-order entropy space, we use Raman et al.'s (c, o) encoding [H]: The bits are 
grouped into small chunks of $b = \frac{\log n}{2}$ bits, and each chunk is represented by two components: the
class $c_i$, which is the number of bits set, and the offset $o_i$, which is an identifier of that chunk within
those of the same class. Raman et al. show that, while the $|c_i|$ lengths add up to $O(n \log\log n / \log n)$
extra bits, the $|o_i| = \lceil \log \binom{b}{c_i} \rceil$ components add up to $nH_0(B) + O(n / \log n)$ bits.
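One possible way to compute a (class, offset) pair is sketched below. It assumes that the offset is the lexicographic rank of the chunk among all chunks of the same length and class, which is one valid choice of identifier and not necessarily the one used by Raman et al.; `encode_chunk`, `decode_chunk`, and `offset_length` are hypothetical names.

```python
from math import comb

def encode_chunk(bits):
    """Return (c, o): c is the number of 1s, o the lexicographic rank among chunks with c ones."""
    b, c = len(bits), sum(bits)
    o, ones_left = 0, c
    for pos, bit in enumerate(bits):
        if bit:
            # Every chunk with the same prefix and a 0 at this position comes first.
            o += comb(b - pos - 1, ones_left)
            ones_left -= 1
    return c, o

def decode_chunk(b, c, o):
    """Inverse of encode_chunk for a chunk of b bits."""
    bits, ones_left = [], c
    for pos in range(b):
        zero_first = comb(b - pos - 1, ones_left)  # chunks that put a 0 at pos
        if ones_left > 0 and o >= zero_first:
            bits.append(1)
            o -= zero_first
            ones_left -= 1
        else:
            bits.append(0)
    return bits

def offset_length(b, c):
    """Number of bits of the offset, ceil(log2 C(b, c))."""
    return (comb(b, c) - 1).bit_length()
```

Chunks that are all zeros or all ones get an offset of zero bits, which is where the compression of sparse or dense bitmaps comes from.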

We plan to store whole chunks in leaves of the range min-max tree. A problem is that the
insertion or even deletion of a single bit in Raman et al.'s representation can up to double the size
of the compressed representation of the segment, because it can change all the alignments. This
occurs for example when moving from $0^b\,1^b\,0^b\,1^b \ldots$ to $1\,0^{b-1}\;\,0\,1^{b-1}\;\,1\,0^{b-1}\;\,0\,1^{b-1} \ldots$, where we switch
from all $c_i = 0$ or $b$ and $|o_i| = 0$, to all $c_i = 1$ or $b - 1$ and $|o_i| = \lceil \log b \rceil$. This problem can be dealt
with (laboriously) on binary trees [29, 22], but not on our k-ary tree, because Fleischer's scheme
does not allow leaves to be partitioned often enough.
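The alignment effect can be checked numerically with a small sketch. It assumes fixed chunks of `b` bits cut from the left and measures only the total offset length, $\sum_i \lceil \log \binom{b}{c_i} \rceil$; the function name `offset_bits` is hypothetical.

```python
from math import comb

def offset_bits(bits, b):
    """Total offset length, sum of ceil(log2 C(b_i, c_i)), for fixed chunks of b bits."""
    total = 0
    for start in range(0, len(bits), b):
        chunk = bits[start:start + b]
        total += (comb(len(chunk), sum(chunk)) - 1).bit_length()
    return total

b = 8
before = ([0] * b + [1] * b) * 4   # 0^b 1^b 0^b 1^b ...
after = [1] + before               # a single bit inserted at the front
size_before = offset_bits(before, b)
size_after = offset_bits(after, b)
```

With these values every chunk of `before` is uniform, so `size_before` is zero, while each of the eight full chunks of `after` holds exactly one minority bit and needs $\lceil \log 8 \rceil = 3$ offset bits.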

We propose a different solution that ensures that an insertion cannot make the leaf's physical
size grow by more than $O(\log n)$ bits. Instead of using the same b value for all the chunks, we
allow any $1 \le b_i \le b$. Thus each chunk is represented by a triple $(b_i, c_i, o_i)$, where $o_i$ is the offset
of this chunk among those of length $b_i$ having $c_i$ bits set. To ensure $O(n \log\log n / \log n)$ space
overhead over the entropy, we state the invariant that any two consecutive chunks i, i + 1 must
satisfy $b_i + b_{i+1} > b$. Thus there are $O(n/b)$ chunks and the overhead of the $b_i$ and $c_i$ components,
representing each with $\lceil \log(b+1) \rceil$ bits, is $O(n \log b / b)$. It is also easy to see [38] that the inequality
$\sum |o_i| = \sum \lceil \log \binom{b_i}{c_i} \rceil = \log \prod \binom{b_i}{c_i} + O(n/\log n) \le \log \binom{n}{m} + O(n/\log n) = nH_0(B) + O(n/\log n)$
holds, where m is the number of 1s in the bitmap.

To maintain the invariant, the insertion of a bit is processed as follows. We first identify the
chunk $(b_i, c_i, o_i)$ where the bit must be inserted, and compute its new description $(b'_i, c'_i, o'_i)$. If
$b'_i > b$, we split the chunk into two, $(b_l, c_l, o_l)$ and $(b_r, c_r, o_r)$, for $b_l, b_r = b'_i/2 \pm 1$. Now we check left
and right neighbors $(b_{i-1}, c_{i-1}, o_{i-1})$ and $(b_{i+1}, c_{i+1}, o_{i+1})$ to ensure the invariant on consecutive
chunks holds. If $b_{i-1} + b_l \le b$ we merge these two chunks, and if $b_r + b_{i+1} \le b$ we merge these two
as well. Merging is done in constant time by obtaining the plain bitmaps, concatenating them, and
reencoding them, using universal tables (which we must have for all $1 \le b_i \le b$). Deletion of a bit
is analogous; we remove the bit and then consider the conditions $b_{i-1} + b'_i \le b$ and $b'_i + b_{i+1} \le b$. It
is easy to see that no insertion/deletion can increase the encoding by more than $O(\log n)$ bits.
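The insertion procedure can be sketched on plain chunks as follows. This is an illustration only: chunks are kept as uncompressed Python lists, so that $b_i$ is the list length and $c_i$ its sum, the universal tables are not modeled, and the helper names `insert_bit` and `insert_at` are hypothetical.

```python
def insert_bit(chunks, i, pos, bit, b):
    """Insert bit at offset pos of chunk i and restore the invariant b_i + b_{i+1} > b."""
    chunk = chunks[i][:pos] + [bit] + chunks[i][pos:]
    if len(chunk) > b:
        half = len(chunk) // 2
        pieces = [chunk[:half], chunk[half:]]  # split an overfull chunk in two
    else:
        pieces = [chunk]
    chunks[i:i + 1] = pieces
    left, right = i, i + len(pieces) - 1
    # Merge with the right neighbor first so that index `left` stays valid.
    if right + 1 < len(chunks) and len(chunks[right]) + len(chunks[right + 1]) <= b:
        chunks[right:right + 2] = [chunks[right] + chunks[right + 1]]
    if left > 0 and len(chunks[left - 1]) + len(chunks[left]) <= b:
        chunks[left - 1:left + 1] = [chunks[left - 1] + chunks[left]]

def insert_at(chunks, j, bit, b):
    """Insert bit so that it becomes position j of the concatenated bitmap."""
    i = 0
    while i < len(chunks) - 1 and j > len(chunks[i]):
        j -= len(chunks[i])
        i += 1
    insert_bit(chunks, i, j, bit, b)
```

At most one split and two merges happen per insertion, and each touches a constant number of chunks, which is why the encoding can grow only by $O(\log n)$ bits.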

Now let us consider codes $x_i = (b_i, c_i, o_i)$. These are clearly constant-time self-delimiting and
$|x_i| = O(\log n)$, so we can directly use Lemma 1 to store them in a range min-max tree within
$n' + O(n' \log\log n' / \log n')$ bits, where $n' = nH_0(B) + O(n \log\log n / \log n)$ is the size of our
compressed representation. Since $n' \le n + O(n \log\log n / \log n)$, we have $O(n' \log\log n' / \log n') =
O(n \log\log n / \log n)$ and the overall space is as promised in the lemma. We must only take care of
checking the invariant on consecutive chunks when merging leaves, which takes constant time.

Now we use the sum/search capabilities of Lemma 1. Let $f_b(b_i, c_i, o_i) = b_i$ and $f_c(b_i, c_i, o_i) = c_i$.
As both are always $O(\log n)$, we can have sum/search support on them. With search on $f_b$ we
can reach the code containing the j-th bit of the original sequence, which is key for accessing an
arbitrary bit. For supporting rank we need to descend using search on $f_b$, and accumulate the
sum on the $f_c$ values of the left siblings as we descend. For supporting select we descend using
search on $f_c$, and accumulate the sum on the $f_b$ values. Finally, for insertions and deletions of
bits we first access the proper position, and then implement the operation via a constant number
of updates, insertions, and deletions of codes (for updating, splitting, and merging our triplets).
Thus we implement all the operations within time $O(\log n / \log\log n)$.
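The interplay of the two functions can be sketched on a flat list of chunks. The sketch assumes uncompressed chunks (so $f_b$ is the chunk length and $f_c$ its number of 1s) and a linear scan in place of the tree descent; `rank1` and `select1` are hypothetical names, with positions counted from 0 and the argument of `select1` counted from 1.

```python
def rank1(chunks, j):
    """Number of 1s in positions 0..j: search on f_b, accumulate f_c."""
    bits_seen = ones_seen = 0
    for chunk in chunks:
        if bits_seen + len(chunk) > j:          # the search on f_b stops here
            return ones_seen + sum(chunk[:j - bits_seen + 1])
        bits_seen += len(chunk)                 # f_b of a left sibling
        ones_seen += sum(chunk)                 # f_c of a left sibling
    raise IndexError("position beyond the end of the bitmap")

def select1(chunks, k):
    """Position of the k-th 1 (k >= 1): search on f_c, accumulate f_b."""
    bits_seen = ones_seen = 0
    for chunk in chunks:
        if ones_seen + sum(chunk) >= k:         # the search on f_c stops here
            for offset, bit in enumerate(chunk):
                ones_seen += bit
                if ones_seen == k:
                    return bits_seen + offset
        bits_seen += len(chunk)
        ones_seen += sum(chunk)
    raise IndexError("fewer than k ones in the bitmap")
```

The two operations are symmetric: each one searches on one function and reports the accumulated sum of the other.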

We can also support attachment and detachment of contiguous bitmaps, by applying essentially
the same techniques developed in Section 7.7. We can have a bitmap $B'[0, n'-1]$ and insert it
between $B[i]$ and $B[i+1]$, or we can detach any $B[i, j]$ from B and convert it into a separate bitmap
that can be handled independently. The complications that arise when cutting the compressed
segments at arbitrary positions are easily handled by splitting codes. Zero-order compression is
retained as it is due to the sum of the local entropies of the chunks, which are preserved (small
resulting segments after the splits are merged as usual).



8.3 Sequences and Text Indices

We now aim at maintaining a sequence $S[0, n-1]$ of symbols over an alphabet $[1, \sigma]$, so that we can
insert and delete symbols, and also compute symbol $rank_c(S, i)$ and $select_c(S, i)$, for $1 \le c \le \sigma$. This
has in particular applications to labeled trees: We can store the sequence S of the labels of a tree
in preorder, so that S[pre_rank(i)] is the label of node i. Insertions and deletions of nodes must be
accompanied with insertions and deletions of their labels at the corresponding preorder positions,
and this can be extended to attaching and detaching subtrees. Then we not only have easy access
to the label of each node, but can also use rank and select on S to find the r-th descendant node
labeled c, or compute the number of descendants labeled c. If the balanced parentheses represent
the tree in DFUDS format [6], we can instead find the first child of a node labeled c using select.
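The descendant queries can be sketched as follows. The sketch assumes that the preorder number and the subtree size of a node are supplied by the tree structure, that "descendants" excludes the node itself, and that rank and select are implemented by linear scans over a Python list; all function names are hypothetical.

```python
def rank_c(S, c, i):
    """Occurrences of c in S[0..i-1]."""
    return S[:i].count(c)

def select_c(S, c, r):
    """Position of the r-th occurrence of c (r >= 1), or -1 if there is none."""
    seen = 0
    for pos, symbol in enumerate(S):
        if symbol == c:
            seen += 1
            if seen == r:
                return pos
    return -1

def descendants_labeled(S, pre, size, c):
    """Number of proper descendants labeled c of the node at preorder pre."""
    return rank_c(S, c, pre + size) - rank_c(S, c, pre + 1)

def rth_descendant_labeled(S, pre, size, c, r):
    """Preorder number of the r-th proper descendant labeled c, or -1."""
    pos = select_c(S, c, rank_c(S, c, pre + 1) + r)
    return pos if pos != -1 and pos < pre + size else -1
```

Both queries rely on the fact that the subtree of a node occupies the contiguous preorder range from pre to pre + size - 1.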

We divide the sequence into chunks of maximum size $b = \frac{1}{2} \log_\sigma n$ symbols and store them
using an extension of the $(c_i, o_i)$ encoding for sequences [18]. Here $c_i = (c_i^1, \ldots, c_i^\sigma)$, where $c_i^a$ is
the number of occurrences of character a in the chunk. For this code to be of length $O(\log n)$ we
need $\sigma = O(\log n / \log\log n)$; more stringent conditions will arise later. To this code we add the $b_i$
component as in Section 8.2. This takes $nH_0(S) + (\ldots)$ bits of space. In the range min-max
tree nodes, which we again assume to hold $O(\log^\epsilon n)$ children for some constant $0 < \epsilon < 1$, instead
of a single $f_c$ function as in Section 8.2 we must store one $f_a$ function for each $a \in [1, \sigma]$, requiring
extra space $(\ldots)$. Symbol rank and select are easily carried out by considering the proper
$f_a$ function. Insertion and deletion of symbol a is carried out in the compressed sequence as before,
and only $f_b$ and $f_a$ sums must be incremented/decremented along the path to the root.

In case a leaf node splits or merges, we must rebuild the partial sums for all the $\sigma$ functions
$f_a$ (and the single function $f_b$) of a node, which requires $O(\sigma \log^\epsilon n)$ time. In Section 7.5 we have
shown how to limit the number of splits/merges to one per operation, thus we can handle all the
operations within $O(\log n / \log\log n)$ time as long as $\sigma = O(\log^{1-\epsilon} n / \log\log n)$. This, again, greatly
simplifies the solution by González and Navarro [22], which used a collection of searchable partial
sums with indels.

Up to here, the result is useful for small alphabets only. González and Navarro [22] handle
larger alphabets by using a multiary wavelet tree (Section 2.1). Recall this is a complete r-ary
tree of height $h = \lceil \log_r \sigma \rceil$ that stores a string over alphabet $[1, r]$ at each node. It solves all the
operations (including insertions and deletions) by h applications of the analogous operation on the
sequences over alphabet $[1, r]$.

Now we set $r = \log^{1-\epsilon} n / \log\log n$, and use the small-alphabet solution to handle the sequences
stored at the wavelet tree nodes. The height of the wavelet tree is $h = O\left(1 + \frac{\log\sigma}{(1-\epsilon)\log\log n}\right)$. The
zero-order entropies of the small-alphabet sequences add up to that of the original sequence and the
redundancies add up to $O(\ldots)$. The operations time is $O\left(\frac{\log n}{\log\log n}\left(1 + \frac{\log\sigma}{(1-\epsilon)\log\log n}\right)\right)$.
By slightly altering $\epsilon$, we achieve the first part of Theorem 3, where the $O(\sigma \log^\epsilon n)$ term owes to
representing the wavelet tree itself, which has $O(\sigma / r)$ nodes.
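The height and time bounds follow from the choice of r by a short calculation, spelled out here as a sketch. It assumes only the definitions in the text, namely $h = \lceil \log_r \sigma \rceil$, $r = \log^{1-\epsilon} n / \log\log n$, and a cost of $O(\log n / \log\log n)$ per operation on each small-alphabet sequence.

```latex
\begin{align*}
\log r &= (1-\epsilon)\log\log n - \log\log\log n
        = (1-\epsilon)\log\log n \,\bigl(1 - o(1)\bigr), \\
h &= \left\lceil \frac{\log\sigma}{\log r} \right\rceil
   = O\!\left(1 + \frac{\log\sigma}{(1-\epsilon)\log\log n}\right), \\
h \cdot O\!\left(\frac{\log n}{\log\log n}\right)
  &= O\!\left(\frac{\log n}{\log\log n}
       \left(1 + \frac{\log\sigma}{(1-\epsilon)\log\log n}\right)\right).
\end{align*}
```

For constant $\epsilon$ the factor $1/(1-\epsilon)$ is itself a constant, which is consistent with the $O(\log n \log\sigma / (\log\log n)^2)$ time stated for large alphabets.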

For the second part, the arity of the nodes fixed to $O(\log^\epsilon n)$ allows us to attach and detach
substrings in time $O(r \log^{1+\epsilon} n)$ on a sequence with alphabet size r. This has to be carried out on
each of the $O(\sigma / r)$ wavelet tree nodes, reaching overall complexity $O(\sigma \log^{1+\epsilon} n)$.

The theorem has immediate application to the handling of compressed dynamic text collections,
construction of compressed static text collections within compressed space, and construction of the
Burrows-Wheeler transform (BWT) within compressed space. We state them here for completeness;
for their derivation refer to the original articles [29, 22].

The first result refers to maintaining a collection of texts within high-order entropy space, so that one
can perform searches and also insert and delete texts. Here $H_h$ refers to the h-th order empirical
entropy of a sequence, see e.g. Manzini [30]. We use sampling step $\log_\sigma n \log\log n$ to achieve it.

Theorem 5 There exists a data structure for handling a collection C of texts over an alphabet $[1, \sigma]$
within size $nH_h(C) + o(n \log\sigma) + O(\sigma^{h+1} \log n + m \log n + w)$ bits, simultaneously for all h. Here n
is the length of the concatenation of m texts, $C = 0\,T_1\,0\,T_2 \cdots 0\,T_m$, and we assume that $\sigma = o(n)$ is
the alphabet size and $w = \Omega(\log n)$ is the machine word size under the RAM model. The structure
supports counting of the occurrences of a pattern P in $O\left(|P| \frac{\log n}{\log\log n}\left(1 + \frac{\log\sigma}{\log\log n}\right)\right)$ time, and inserting
and deleting a text T in $O\left(\log n + |T| \frac{\log n}{\log\log n}\left(1 + \frac{\log\sigma}{\log\log n}\right)\right)$ time. After counting, any occurrence can
be located in time $O\left(\frac{\log^2 n}{\log\log n}\left(1 + \frac{\log\log n}{\log\sigma}\right)\right)$. Any substring of length $\ell$ from any T in the collection
can be displayed in time $O\left(\frac{\log^2 n}{\log\log n}\left(1 + \frac{\log\log n}{\log\sigma}\right) + \ell \frac{\log n}{\log\log n}\left(1 + \frac{\log\sigma}{\log\log n}\right)\right)$. For $h \le (\alpha \log_\sigma n) - 1$, for
any constant $0 < \alpha < 1$, the space complexity simplifies to $nH_h(C) + o(n \log\sigma) + O(m \log n + w)$
bits.

The second result refers to the construction of the most succinct self-index for text within the 
same asymptotic space required by the final structure. This is tightly related to the construction 
of the BWT, which has many applications. 

Theorem 6 The Alphabet-Friendly FM-index [18], as well as the BWT of a text $T[0, n-1]$
over an alphabet of size $\sigma$, can be built using $nH_h(T) + o(n \log\sigma)$ bits, simultaneously for all
$h \le (\alpha \log_\sigma n) - 1$ and any constant $0 < \alpha < 1$, in time $O\left(n \frac{\log n}{\log\log n}\left(1 + \frac{\log\sigma}{\log\log n}\right)\right)$.

On polylog-sized alphabets, we build the BWT in $o(n \log n)$ time. Even on a large alphabet
$\sigma = \Theta(n)$, we build the BWT in $o(n \log^2 n)$ time. This slashes by a $\log\log n$ factor the corresponding
previous result [22]. Other previous results that focus on using little space are as follows. Okanohara
and Sadakane [37] achieved optimal $O(n)$ construction time with $O(n \log\sigma \log\log_\sigma n)$ bits of extra
space (apart from the $n \log\sigma$ bits of the sequence). Hon et al. [25] achieve $O(n \log\log\sigma)$ time and
$O(n \log\sigma)$ bits of extra space. Ours is the fastest construction within compressed space.

9 Concluding remarks 

We have proposed flexible and powerful data structures for the succinct representation of ordinal
trees. For the static case, all the known operations are done in constant time using
$2n + O(n/\mathrm{polylog}(n))$ bits of space, for a tree of n nodes and a polylog of any degree. This
significantly improves upon the redundancy of previous representations. The core of the idea is the range
min-max tree. This simple data structure reduces all of the operations to a handful of primitives, 
which run in constant time on polylog-sized subtrees. It can be used in standalone form to obtain 
a simple and practical implementation that achieves O(logn) time for all the operations. We then 
show how constant time can be achieved by using the range min-max tree as a building block for 
handling larger trees. 

The simple variant using one range min-max tree has been implemented and
compared with the state of the art over several real-life trees [2]. It has been shown that it is by
far the smallest and fastest representation in most cases, as well as the one with widest coverage
of operations. It requires around 2.37 bits per node and carries out most operations within a
microsecond on a standard PC.

For the dynamic case, there have been no data structures supporting several of the usual tree
operations. The data structures of this paper support all of the operations, including node insertion
and deletion, in $O(\log n)$ time, and a variant supports most of them in $O(\log n / \log\log n)$ time,
which is optimal in the dynamic case even for a very reduced set of operations. They are based on
dynamic range min-max trees, and especially the former is extremely simple and implementable. We
expect a performance similar to that of the static version in practice. Their flexibility is illustrated
by the fact that we can support much more complex operations, such as attaching and detaching
whole subtrees.

Our work contains several ideas of independent interest. An immediate application to storing
a dynamic sequence of numbers supporting operations sum and search achieves optimal time
$O(\log n / \log\log n)$. Another application is the storage of dynamic compressed sequences achieving
zero-order entropy space and improving the redundancy of previous work. It also improves the
times for the operations, achieving the optimal $O(\log n / \log\log n)$ for polylog-sized alphabets. This
in turn has several applications to compressed text indexing.

Pătrașcu and Viola have recently shown that $n + n/w^{O(t)}$ bits are necessary to compute rank or
select on bitmaps in time $O(t)$ in the worst case [39]. This lower bound holds also in the subclass
of balanced bitmaps (i.e., those corresponding to balanced parenthesis sequences), which makes
our redundancy on static trees optimal as well, at least for some of the operations: Since rank or
select can be obtained from any of the operations depth, pre_rank, post_rank, pre_select, post_select,
any balanced parentheses representation supporting any of these operations in time $O(t)$ requires
$2n + 2n/w^{O(t)}$ bits of space. Still, it would be good to show a lower bound for the more fundamental
set of operations findopen, findclose, and enclose.

On the other hand, the complexity $O(\log n / \log\log n)$ is known to be optimal for several basic
dynamic tree operations, but not for all. It is also not clear if the redundancy $O(n/r)$ achieved for
the dynamic trees, $r = \log n$ for the simpler structure and $r = \log n / \log\log n$ for the more complex
one, is optimal to achieve the corresponding $O(r)$ operation times. Finally, it would be good to
achieve $O(\log n / \log\log n)$ time for all the operations or prove it impossible.